{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 04_Train_TransR\n",
    "#\n",
    "# created by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 27, 2023\n",
    "# updated by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 27, 2023\n",
    "#\n",
    "# 该脚本展示了如何在 DRKG 上训练模型 (TransR), 并利用网格搜索寻找到最优参数.\n",
    "#\n",
    "# 需要的包:\n",
    "#          torch\n",
    "#          dgl, version: 0.4.3\n",
    "#          dglke\n",
    "#          numpy\n",
    "#\n",
    "# 需要的文件:\n",
    "#          ./dataset\n",
    "#\n",
    "# 源教程链接: https://github.com/gnn4dr/DRKG/blob/master/embedding_analysis/Train_embeddings.ipynb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training DRKG Using TransR\n",
    "\n",
    "这个 notebook 展示了如何在 DRKG 上训练模型 (TransR), 并利用网格搜索寻找到最优参数."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 导入需要的库"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 网格搜索参数\n",
    "\n",
    "我们能使用 DGL-KE 命令训练 TransR 模型, 关于如何使用 DGL-KE 的更多信息请参考 https://github.com/awslabs/dgl-ke.\n",
    "\n",
    "这里我们使用两个 GPU 训练模型.\n",
    "\n",
    "大约 100000 * 2.7 / 3600 = 75.0 h"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
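  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The candidate values listed above span a small grid. As a minimal sketch (not part of the original tutorial), the combinations can be enumerated with `itertools.product`. The names `grid` and `configs` are illustrative assumptions, and the code only lists configurations; it does not launch any training.\n",
    "\n",
    "```python\n",
    "import itertools\n",
    "\n",
    "# Candidate values copied from the list above (hypothetical variable names).\n",
    "grid = {\n",
    "    'batch_size': [4096],\n",
    "    'neg_sample_size': [256],\n",
    "    'hidden_dim': [200, 400],\n",
    "    'gamma': [6, 12, 18],\n",
    "    'lr': [0.01, 0.05, 0.1],\n",
    "}\n",
    "\n",
    "# Cartesian product of all candidate values, one dict per configuration.\n",
    "configs = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]\n",
    "\n",
    "print(len(configs))  # 1 * 1 * 2 * 3 * 3 = 18 combinations\n",
    "print(configs[0])\n",
    "```\n",
    "\n",
    "Each dictionary in `configs` holds one hyperparameter combination; the first one matches the values shown in bold above."
   ]
  },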
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading train triples....\n",
      "Finished. Read 5286834 train triples.\n",
      "Reading valid triples....\n",
      "Finished. Read 293713 valid triples.\n",
      "Reading test triples....\n",
      "Finished. Read 293714 test triples.\n",
      "|Train|: 5286834\n",
      "random partition 5286834 edges into 2 parts\n",
      "part 0 has 2643417 edges\n",
      "part 1 has 2643417 edges\n",
      "/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dgl/base.py:25: UserWarning: multigraph will be deprecated.DGL will treat all graphs as multigraph in the future.\n",
      "  warnings.warn(msg, warn_type)\n",
      "|valid|: 293713\n",
      "|test|: 293714\n",
      "Total initialize time 16.306 seconds\n",
      "[proc 1][Train](1/100000) average pos_loss: 36.63789749145508\n",
      "[proc 1][Train](1/100000) average neg_loss: 0.006224163342267275\n",
      "[proc 1][Train](1/100000) average loss: 18.32206153869629\n",
      "[proc 1][Train](1/100000) average regularization: 4.492153038881952e-06\n",
      "[proc 1][Train] 1 steps take 5.332 seconds\n",
      "[proc 1]sample: 0.235, forward: 2.697, backward: 0.097, update: 2.302\n",
      "[proc 0][Train](1/100000) average pos_loss: 36.601043701171875\n",
      "[proc 0][Train](1/100000) average neg_loss: 0.0046770223416388035\n",
      "[proc 0][Train](1/100000) average loss: 18.302860260009766\n",
      "[proc 0][Train](1/100000) average regularization: 4.475013611227041e-06\n",
      "[proc 0][Train] 1 steps take 6.096 seconds\n",
      "[proc 0]sample: 0.268, forward: 2.793, backward: 0.087, update: 2.948\n",
      "[proc 1][Train](2/100000) average pos_loss: 29.84006118774414\n",
      "[proc 1][Train](2/100000) average neg_loss: 0.0073224445804953575\n",
      "[proc 1][Train](2/100000) average loss: 14.923691749572754\n",
      "[proc 1][Train](2/100000) average regularization: 3.6148576327832416e-06\n",
      "[proc 1][Train] 1 steps take 2.854 seconds\n",
      "[proc 1]sample: 0.184, forward: 0.423, backward: 0.072, update: 2.174\n",
      "[proc 0][Train](2/100000) average pos_loss: 27.10210418701172\n",
      "[proc 0][Train](2/100000) average neg_loss: 0.007338757626712322\n",
      "[proc 0][Train](2/100000) average loss: 13.55472183227539\n",
      "[proc 0][Train](2/100000) average regularization: 2.9746299787802855e-06\n",
      "[proc 0][Train] 1 steps take 2.853 seconds\n",
      "[proc 0]sample: 0.149, forward: 0.438, backward: 0.072, update: 2.194\n",
      "[proc 1][Train](3/100000) average pos_loss: 24.835981369018555\n",
      "[proc 1][Train](3/100000) average neg_loss: 0.01369731966406107\n",
      "[proc 1][Train](3/100000) average loss: 12.42483901977539\n",
      "[proc 1][Train](3/100000) average regularization: 2.7258292902843095e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.006, forward: 0.432, backward: 0.070, update: 2.162\n",
      "[proc 0][Train](3/100000) average pos_loss: 23.119762420654297\n",
      "[proc 0][Train](3/100000) average neg_loss: 0.007119415327906609\n",
      "[proc 0][Train](3/100000) average loss: 11.563441276550293\n",
      "[proc 0][Train](3/100000) average regularization: 2.5231531708413968e-06\n",
      "[proc 0][Train] 1 steps take 2.648 seconds\n",
      "[proc 0]sample: 0.003, forward: 0.432, backward: 0.070, update: 2.142\n",
      "[proc 1][Train](4/100000) average pos_loss: 20.994426727294922\n",
      "[proc 1][Train](4/100000) average neg_loss: 0.004906617570668459\n",
      "[proc 1][Train](4/100000) average loss: 10.499666213989258\n",
      "[proc 1][Train](4/100000) average regularization: 2.329000380996149e-06\n",
      "[proc 1][Train] 1 steps take 2.855 seconds\n",
      "[proc 1]sample: 0.004, forward: 0.531, backward: 0.070, update: 2.250\n",
      "[proc 0][Train](4/100000) average pos_loss: 19.765216827392578\n",
      "[proc 0][Train](4/100000) average neg_loss: 0.010961983352899551\n",
      "[proc 0][Train](4/100000) average loss: 9.888089179992676\n",
      "[proc 0][Train](4/100000) average regularization: 2.1910129817115376e-06\n",
      "[proc 0][Train] 1 steps take 2.667 seconds\n",
      "[proc 0]sample: 0.003, forward: 0.452, backward: 0.069, update: 2.143\n",
      "[proc 1][Train](5/100000) average pos_loss: 18.294017791748047\n",
      "[proc 1][Train](5/100000) average neg_loss: 0.009284732863307\n",
      "[proc 1][Train](5/100000) average loss: 9.151651382446289\n",
      "[proc 1][Train](5/100000) average regularization: 2.069839865725953e-06\n",
      "[proc 1][Train] 1 steps take 2.536 seconds\n",
      "[proc 1]sample: 0.005, forward: 0.486, backward: 0.071, update: 1.974\n",
      "[proc 0][Train](5/100000) average pos_loss: 18.406902313232422\n",
      "[proc 0][Train](5/100000) average neg_loss: 0.007661604322493076\n",
      "[proc 0][Train](5/100000) average loss: 9.207282066345215\n",
      "[proc 0][Train](5/100000) average regularization: 2.039915443674545e-06\n",
      "[proc 0][Train] 1 steps take 2.762 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.071, update: 2.254\n",
      "[proc 1][Train](6/100000) average pos_loss: 16.270174026489258\n",
      "[proc 1][Train](6/100000) average neg_loss: 0.014494983479380608\n",
      "[proc 1][Train](6/100000) average loss: 8.142334938049316\n",
      "[proc 1][Train](6/100000) average regularization: 1.8808847244145e-06\n",
      "[proc 1][Train] 1 steps take 2.714 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.441, backward: 0.069, update: 2.201\n",
      "[proc 0][Train](6/100000) average pos_loss: 15.351273536682129\n",
      "[proc 0][Train](6/100000) average neg_loss: 0.013798362575471401\n",
      "[proc 0][Train](6/100000) average loss: 7.6825361251831055\n",
      "[proc 0][Train](6/100000) average regularization: 1.7957512454813696e-06\n",
      "[proc 0][Train] 1 steps take 2.662 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.410, backward: 0.071, update: 2.179\n",
      "[proc 1][Train](7/100000) average pos_loss: 14.346282958984375\n",
      "[proc 1][Train](7/100000) average neg_loss: 0.01794321835041046\n",
      "[proc 1][Train](7/100000) average loss: 7.182113170623779\n",
      "[proc 1][Train](7/100000) average regularization: 1.7084098544728477e-06\n",
      "[proc 1][Train] 1 steps take 2.652 seconds\n",
      "[proc 1]sample: 0.004, forward: 0.436, backward: 0.070, update: 2.143\n",
      "[proc 0][Train](7/100000) average pos_loss: 13.660019874572754\n",
      "[proc 0][Train](7/100000) average neg_loss: 0.02514929324388504\n",
      "[proc 0][Train](7/100000) average loss: 6.842584609985352\n",
      "[proc 0][Train](7/100000) average regularization: 1.645865268073976e-06\n",
      "[proc 0][Train] 1 steps take 2.741 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.236\n",
      "[proc 1][Train](8/100000) average pos_loss: 13.00410270690918\n",
      "[proc 1][Train](8/100000) average neg_loss: 0.027249913662672043\n",
      "[proc 1][Train](8/100000) average loss: 6.515676498413086\n",
      "[proc 1][Train](8/100000) average regularization: 1.5854625416977797e-06\n",
      "[proc 1][Train] 1 steps take 3.350 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.423, backward: 0.069, update: 2.856\n",
      "[proc 0][Train](8/100000) average pos_loss: 12.476690292358398\n",
      "[proc 0][Train](8/100000) average neg_loss: 0.032642293721437454\n",
      "[proc 0][Train](8/100000) average loss: 6.254666328430176\n",
      "[proc 0][Train](8/100000) average regularization: 1.5416231917697587e-06\n",
      "[proc 0][Train] 1 steps take 2.689 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.425, backward: 0.071, update: 2.192\n",
      "[proc 1][Train](9/100000) average pos_loss: 11.975502967834473\n",
      "[proc 1][Train](9/100000) average neg_loss: 0.057034146040678024\n",
      "[proc 1][Train](9/100000) average loss: 6.016268730163574\n",
      "[proc 1][Train](9/100000) average regularization: 1.4930360521248076e-06\n",
      "[proc 1][Train] 1 steps take 2.676 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.473, backward: 0.070, update: 2.131\n",
      "[proc 0][Train](9/100000) average pos_loss: 11.550851821899414\n",
      "[proc 0][Train](9/100000) average neg_loss: 0.061328910291194916\n",
      "[proc 0][Train](9/100000) average loss: 5.806090354919434\n",
      "[proc 0][Train](9/100000) average regularization: 1.4678298612125218e-06\n",
      "[proc 0][Train] 1 steps take 2.916 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.518, backward: 0.070, update: 2.326\n",
      "[proc 1][Train](10/100000) average pos_loss: 10.6336088180542\n",
      "[proc 1][Train](10/100000) average neg_loss: 0.05582984909415245\n",
      "[proc 1][Train](10/100000) average loss: 5.344719409942627\n",
      "[proc 1][Train](10/100000) average regularization: 1.4169390851748176e-06\n",
      "[proc 1][Train] 1 steps take 2.706 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.475, backward: 0.070, update: 2.159\n",
      "[proc 0][Train](10/100000) average pos_loss: 10.94680404663086\n",
      "[proc 0][Train](10/100000) average neg_loss: 0.05812148004770279\n",
      "[proc 0][Train](10/100000) average loss: 5.502462863922119\n",
      "[proc 0][Train](10/100000) average regularization: 1.400591827405151e-06\n",
      "[proc 0][Train] 1 steps take 2.673 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.440, backward: 0.069, update: 2.162\n",
      "[proc 1][Train](11/100000) average pos_loss: 9.893590927124023\n",
      "[proc 1][Train](11/100000) average neg_loss: 0.09049514681100845\n",
      "[proc 1][Train](11/100000) average loss: 4.9920430183410645\n",
      "[proc 1][Train](11/100000) average regularization: 1.3120397852617316e-06\n",
      "[proc 1][Train] 1 steps take 2.997 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.463, backward: 0.069, update: 2.462\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](11/100000) average pos_loss: 9.713321685791016\n",
      "[proc 0][Train](11/100000) average neg_loss: 0.09249696135520935\n",
      "[proc 0][Train](11/100000) average loss: 4.902909278869629\n",
      "[proc 0][Train](11/100000) average regularization: 1.2937773590238066e-06\n",
      "[proc 0][Train] 1 steps take 2.837 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.070, update: 2.331\n",
      "[proc 1][Train](12/100000) average pos_loss: 9.355192184448242\n",
      "[proc 1][Train](12/100000) average neg_loss: 0.08135785162448883\n",
      "[proc 1][Train](12/100000) average loss: 4.71827507019043\n",
      "[proc 1][Train](12/100000) average regularization: 1.2754463796227355e-06\n",
      "[proc 1][Train] 1 steps take 2.964 seconds\n",
      "[proc 1]sample: 0.003, forward: 0.540, backward: 0.071, update: 2.350\n",
      "[proc 0][Train](12/100000) average pos_loss: 9.171796798706055\n",
      "[proc 0][Train](12/100000) average neg_loss: 0.09549665451049805\n",
      "[proc 0][Train](12/100000) average loss: 4.6336469650268555\n",
      "[proc 0][Train](12/100000) average regularization: 1.300097324019589e-06\n",
      "[proc 0][Train] 1 steps take 2.877 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.481, backward: 0.072, update: 2.323\n",
      "[proc 1][Train](13/100000) average pos_loss: 8.42745590209961\n",
      "[proc 1][Train](13/100000) average neg_loss: 0.1471402645111084\n",
      "[proc 1][Train](13/100000) average loss: 4.287298202514648\n",
      "[proc 1][Train](13/100000) average regularization: 1.223890080837009e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.457, backward: 0.069, update: 2.151\n",
      "[proc 0][Train](13/100000) average pos_loss: 8.34461498260498\n",
      "[proc 0][Train](13/100000) average neg_loss: 0.14797209203243256\n",
      "[proc 0][Train](13/100000) average loss: 4.246293544769287\n",
      "[proc 0][Train](13/100000) average regularization: 1.2257295338713448e-06\n",
      "[proc 0][Train] 1 steps take 2.909 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.444, backward: 0.069, update: 2.394\n",
      "[proc 1][Train](14/100000) average pos_loss: 8.125677108764648\n",
      "[proc 1][Train](14/100000) average neg_loss: 0.14386668801307678\n",
      "[proc 1][Train](14/100000) average loss: 4.134771823883057\n",
      "[proc 1][Train](14/100000) average regularization: 1.2110123179809307e-06\n",
      "[proc 1][Train] 1 steps take 2.794 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.478, backward: 0.070, update: 2.243\n",
      "[proc 0][Train](14/100000) average pos_loss: 8.065876007080078\n",
      "[proc 0][Train](14/100000) average neg_loss: 0.13409805297851562\n",
      "[proc 0][Train](14/100000) average loss: 4.099987030029297\n",
      "[proc 0][Train](14/100000) average regularization: 1.2145261507612304e-06\n",
      "[proc 0][Train] 1 steps take 2.765 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.070, update: 2.255\n",
      "[proc 1][Train](15/100000) average pos_loss: 7.324425220489502\n",
      "[proc 1][Train](15/100000) average neg_loss: 0.20663751661777496\n",
      "[proc 1][Train](15/100000) average loss: 3.765531301498413\n",
      "[proc 1][Train](15/100000) average regularization: 1.1733926612578216e-06\n",
      "[proc 1][Train] 1 steps take 2.845 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.445, backward: 0.070, update: 2.329\n",
      "[proc 0][Train](15/100000) average pos_loss: 7.393374443054199\n",
      "[proc 0][Train](15/100000) average neg_loss: 0.20941519737243652\n",
      "[proc 0][Train](15/100000) average loss: 3.8013949394226074\n",
      "[proc 0][Train](15/100000) average regularization: 1.191991827909078e-06\n",
      "[proc 0][Train] 1 steps take 2.793 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.422, backward: 0.069, update: 2.299\n",
      "[proc 1][Train](16/100000) average pos_loss: 6.981203079223633\n",
      "[proc 1][Train](16/100000) average neg_loss: 0.2039966732263565\n",
      "[proc 1][Train](16/100000) average loss: 3.592599868774414\n",
      "[proc 1][Train](16/100000) average regularization: 1.158442046289565e-06\n",
      "[proc 1][Train] 1 steps take 2.717 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.523, backward: 0.070, update: 2.121\n",
      "[proc 0][Train](16/100000) average pos_loss: 7.1572489738464355\n",
      "[proc 0][Train](16/100000) average neg_loss: 0.17176035046577454\n",
      "[proc 0][Train](16/100000) average loss: 3.6645047664642334\n",
      "[proc 0][Train](16/100000) average regularization: 1.173939722320938e-06\n",
      "[proc 0][Train] 1 steps take 3.041 seconds\n",
      "[proc 0]sample: 0.012, forward: 0.795, backward: 0.070, update: 2.163\n",
      "[proc 1][Train](17/100000) average pos_loss: 6.545874118804932\n",
      "[proc 1][Train](17/100000) average neg_loss: 0.28687605261802673\n",
      "[proc 1][Train](17/100000) average loss: 3.416375160217285\n",
      "[proc 1][Train](17/100000) average regularization: 1.1478965689093457e-06\n",
      "[proc 1][Train] 1 steps take 2.760 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.419, backward: 0.070, update: 2.255\n",
      "[proc 0][Train](17/100000) average pos_loss: 6.3651018142700195\n",
      "[proc 0][Train](17/100000) average neg_loss: 0.28248366713523865\n",
      "[proc 0][Train](17/100000) average loss: 3.3237926959991455\n",
      "[proc 0][Train](17/100000) average regularization: 1.1227334653085563e-06\n",
      "[proc 0][Train] 1 steps take 2.700 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.433, backward: 0.070, update: 2.181\n",
      "[proc 1][Train](18/100000) average pos_loss: 6.091341972351074\n",
      "[proc 1][Train](18/100000) average neg_loss: 0.24829214811325073\n",
      "[proc 1][Train](18/100000) average loss: 3.1698169708251953\n",
      "[proc 1][Train](18/100000) average regularization: 1.1783088211814174e-06\n",
      "[proc 1][Train] 1 steps take 3.076 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.476, backward: 0.070, update: 2.514\n",
      "[proc 0][Train](18/100000) average pos_loss: 6.212044715881348\n",
      "[proc 0][Train](18/100000) average neg_loss: 0.243007630109787\n",
      "[proc 0][Train](18/100000) average loss: 3.2275261878967285\n",
      "[proc 0][Train](18/100000) average regularization: 1.1600467360040057e-06\n",
      "[proc 0][Train] 1 steps take 2.783 seconds\n",
      "[proc 0]sample: 0.021, forward: 0.445, backward: 0.069, update: 2.248\n",
      "[proc 1][Train](19/100000) average pos_loss: 5.959550857543945\n",
      "[proc 1][Train](19/100000) average neg_loss: 0.29410886764526367\n",
      "[proc 1][Train](19/100000) average loss: 3.1268298625946045\n",
      "[proc 1][Train](19/100000) average regularization: 1.1281489378234255e-06\n",
      "[proc 1][Train] 1 steps take 2.825 seconds\n",
      "[proc 1]sample: 0.003, forward: 0.462, backward: 0.070, update: 2.289\n",
      "[proc 0][Train](19/100000) average pos_loss: 5.868434906005859\n",
      "[proc 0][Train](19/100000) average neg_loss: 0.2916373908519745\n",
      "[proc 0][Train](19/100000) average loss: 3.080036163330078\n",
      "[proc 0][Train](19/100000) average regularization: 1.112359086619108e-06\n",
      "[proc 0][Train] 1 steps take 2.818 seconds\n",
      "[proc 0]sample: 0.007, forward: 0.452, backward: 0.070, update: 2.289\n",
      "[proc 1][Train](20/100000) average pos_loss: 5.633501052856445\n",
      "[proc 1][Train](20/100000) average neg_loss: 0.26164573431015015\n",
      "[proc 1][Train](20/100000) average loss: 2.94757342338562\n",
      "[proc 1][Train](20/100000) average regularization: 1.1875150676132762e-06\n",
      "[proc 1][Train] 1 steps take 2.921 seconds\n",
      "[proc 1]sample: 0.003, forward: 0.501, backward: 0.071, update: 2.346\n",
      "[proc 0][Train](20/100000) average pos_loss: 5.612407684326172\n",
      "[proc 0][Train](20/100000) average neg_loss: 0.2752316892147064\n",
      "[proc 0][Train](20/100000) average loss: 2.943819761276245\n",
      "[proc 0][Train](20/100000) average regularization: 1.172965767182177e-06\n",
      "[proc 0][Train] 1 steps take 2.784 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.475, backward: 0.069, update: 2.238\n",
      "[proc 1][Train](21/100000) average pos_loss: 5.367003440856934\n",
      "[proc 1][Train](21/100000) average neg_loss: 0.2931874096393585\n",
      "[proc 1][Train](21/100000) average loss: 2.8300955295562744\n",
      "[proc 1][Train](21/100000) average regularization: 1.1814495337603148e-06\n",
      "[proc 1][Train] 1 steps take 2.844 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.473, backward: 0.071, update: 2.298\n",
      "[proc 0][Train](21/100000) average pos_loss: 5.303811073303223\n",
      "[proc 0][Train](21/100000) average neg_loss: 0.2923874258995056\n",
      "[proc 0][Train](21/100000) average loss: 2.7980992794036865\n",
      "[proc 0][Train](21/100000) average regularization: 1.1334728924339288e-06\n",
      "[proc 0][Train] 1 steps take 2.799 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.477, backward: 0.069, update: 2.251\n",
      "[proc 0][Train](22/100000) average pos_loss: 5.062108039855957\n",
      "[proc 0][Train](22/100000) average neg_loss: 0.2892957329750061\n",
      "[proc 0][Train](22/100000) average loss: 2.675701856613159\n",
      "[proc 0][Train](22/100000) average regularization: 1.263408307750069e-06\n",
      "[proc 0][Train] 1 steps take 2.896 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.481, backward: 0.069, update: 2.343\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](22/100000) average pos_loss: 5.2835798263549805\n",
      "[proc 1][Train](22/100000) average neg_loss: 0.26237356662750244\n",
      "[proc 1][Train](22/100000) average loss: 2.7729766368865967\n",
      "[proc 1][Train](22/100000) average regularization: 1.26339728012681e-06\n",
      "[proc 1][Train] 1 steps take 2.971 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.494, backward: 0.070, update: 2.404\n",
      "[proc 1][Train](23/100000) average pos_loss: 5.034053802490234\n",
      "[proc 1][Train](23/100000) average neg_loss: 0.3313691318035126\n",
      "[proc 1][Train](23/100000) average loss: 2.682711362838745\n",
      "[proc 1][Train](23/100000) average regularization: 1.2029570370941656e-06\n",
      "[proc 1][Train] 1 steps take 2.857 seconds\n",
      "[proc 1]sample: 0.004, forward: 0.471, backward: 0.069, update: 2.313\n",
      "[proc 0][Train](23/100000) average pos_loss: 4.892105579376221\n",
      "[proc 0][Train](23/100000) average neg_loss: 0.3143060505390167\n",
      "[proc 0][Train](23/100000) average loss: 2.603205919265747\n",
      "[proc 0][Train](23/100000) average regularization: 1.2189046856292407e-06\n",
      "[proc 0][Train] 1 steps take 3.375 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.502, backward: 0.070, update: 2.802\n",
      "[proc 1][Train](24/100000) average pos_loss: 4.692424774169922\n",
      "[proc 1][Train](24/100000) average neg_loss: 0.3152276873588562\n",
      "[proc 1][Train](24/100000) average loss: 2.503826141357422\n",
      "[proc 1][Train](24/100000) average regularization: 1.336722561973147e-06\n",
      "[proc 1][Train] 1 steps take 2.835 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.414, backward: 0.069, update: 2.350\n",
      "[proc 0][Train](24/100000) average pos_loss: 4.848245620727539\n",
      "[proc 0][Train](24/100000) average neg_loss: 0.3135529160499573\n",
      "[proc 0][Train](24/100000) average loss: 2.580899238586426\n",
      "[proc 0][Train](24/100000) average regularization: 1.3522784456654335e-06\n",
      "[proc 0][Train] 1 steps take 2.815 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.445, backward: 0.069, update: 2.298\n",
      "[proc 0][Train](25/100000) average pos_loss: 4.586786270141602\n",
      "[proc 0][Train](25/100000) average neg_loss: 0.3261682093143463\n",
      "[proc 0][Train](25/100000) average loss: 2.456477165222168\n",
      "[proc 0][Train](25/100000) average regularization: 1.254979338227713e-06\n",
      "[proc 0][Train] 1 steps take 2.568 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.413, backward: 0.070, update: 2.084\n",
      "[proc 1][Train](25/100000) average pos_loss: 4.678973197937012\n",
      "[proc 1][Train](25/100000) average neg_loss: 0.3497745096683502\n",
      "[proc 1][Train](25/100000) average loss: 2.514373779296875\n",
      "[proc 1][Train](25/100000) average regularization: 1.2307428960411926e-06\n",
      "[proc 1][Train] 1 steps take 3.125 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.769, backward: 0.069, update: 2.284\n",
      "[proc 0][Train](26/100000) average pos_loss: 4.195809364318848\n",
      "[proc 0][Train](26/100000) average neg_loss: 0.3033146858215332\n",
      "[proc 0][Train](26/100000) average loss: 2.2495620250701904\n",
      "[proc 0][Train](26/100000) average regularization: 1.4045901934878202e-06\n",
      "[proc 0][Train] 1 steps take 2.945 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.533, backward: 0.069, update: 2.341\n",
      "[proc 1][Train](26/100000) average pos_loss: 4.4696879386901855\n",
      "[proc 1][Train](26/100000) average neg_loss: 0.3089756369590759\n",
      "[proc 1][Train](26/100000) average loss: 2.389331817626953\n",
      "[proc 1][Train](26/100000) average regularization: 1.4035989579497254e-06\n",
      "[proc 1][Train] 1 steps take 2.887 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.460, backward: 0.069, update: 2.357\n",
      "[proc 0][Train](27/100000) average pos_loss: 4.269285202026367\n",
      "[proc 0][Train](27/100000) average neg_loss: 0.3621605932712555\n",
      "[proc 0][Train](27/100000) average loss: 2.315722942352295\n",
      "[proc 0][Train](27/100000) average regularization: 1.3273288459458854e-06\n",
      "[proc 0][Train] 1 steps take 2.854 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.513, backward: 0.069, update: 2.269\n",
      "[proc 1][Train](27/100000) average pos_loss: 4.121082305908203\n",
      "[proc 1][Train](27/100000) average neg_loss: 0.35940054059028625\n",
      "[proc 1][Train](27/100000) average loss: 2.240241527557373\n",
      "[proc 1][Train](27/100000) average regularization: 1.3491464869730407e-06\n",
      "[proc 1][Train] 1 steps take 2.945 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.476, backward: 0.069, update: 2.398\n",
      "[proc 0][Train](28/100000) average pos_loss: 3.8524880409240723\n",
      "[proc 0][Train](28/100000) average neg_loss: 0.3343879282474518\n",
      "[proc 0][Train](28/100000) average loss: 2.093437910079956\n",
      "[proc 0][Train](28/100000) average regularization: 1.4181988490236108e-06\n",
      "[proc 0][Train] 1 steps take 2.712 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.488, backward: 0.070, update: 2.153\n",
      "[proc 1][Train](28/100000) average pos_loss: 4.077985763549805\n",
      "[proc 1][Train](28/100000) average neg_loss: 0.3288956880569458\n",
      "[proc 1][Train](28/100000) average loss: 2.2034406661987305\n",
      "[proc 1][Train](28/100000) average regularization: 1.4208317224984057e-06\n",
      "[proc 1][Train] 1 steps take 2.759 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.458, backward: 0.073, update: 2.225\n",
      "[proc 0][Train](29/100000) average pos_loss: 3.9301321506500244\n",
      "[proc 0][Train](29/100000) average neg_loss: 0.3703135550022125\n",
      "[proc 0][Train](29/100000) average loss: 2.1502227783203125\n",
      "[proc 0][Train](29/100000) average regularization: 1.3260673767945264e-06\n",
      "[proc 0][Train] 1 steps take 2.704 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.477, backward: 0.070, update: 2.156\n",
      "[proc 1][Train](29/100000) average pos_loss: 3.9045019149780273\n",
      "[proc 1][Train](29/100000) average neg_loss: 0.3767911195755005\n",
      "[proc 1][Train](29/100000) average loss: 2.140646457672119\n",
      "[proc 1][Train](29/100000) average regularization: 1.3695663483304088e-06\n",
      "[proc 1][Train] 1 steps take 2.867 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.463, backward: 0.071, update: 2.332\n",
      "[proc 0][Train](30/100000) average pos_loss: 3.730567693710327\n",
      "[proc 0][Train](30/100000) average neg_loss: 0.35675546526908875\n",
      "[proc 0][Train](30/100000) average loss: 2.043661594390869\n",
      "[proc 0][Train](30/100000) average regularization: 1.5602010989823611e-06\n",
      "[proc 0][Train] 1 steps take 2.877 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.460, backward: 0.070, update: 2.345\n",
      "[proc 1][Train](30/100000) average pos_loss: 3.8477325439453125\n",
      "[proc 1][Train](30/100000) average neg_loss: 0.3875545263290405\n",
      "[proc 1][Train](30/100000) average loss: 2.1176435947418213\n",
      "[proc 1][Train](30/100000) average regularization: 1.4984474319135188e-06\n",
      "[proc 1][Train] 1 steps take 2.720 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.386, backward: 0.069, update: 2.263\n",
      "[proc 0][Train](31/100000) average pos_loss: 3.60250186920166\n",
      "[proc 0][Train](31/100000) average neg_loss: 0.3329826295375824\n",
      "[proc 0][Train](31/100000) average loss: 1.9677422046661377\n",
      "[proc 0][Train](31/100000) average regularization: 1.4283532436820678e-06\n",
      "[proc 0][Train] 1 steps take 2.687 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.480, backward: 0.070, update: 2.135\n",
      "[proc 1][Train](31/100000) average pos_loss: 3.7226619720458984\n",
      "[proc 1][Train](31/100000) average neg_loss: 0.3687988519668579\n",
      "[proc 1][Train](31/100000) average loss: 2.0457303524017334\n",
      "[proc 1][Train](31/100000) average regularization: 1.4159704733174294e-06\n",
      "[proc 1][Train] 1 steps take 2.817 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.464, backward: 0.069, update: 2.281\n",
      "[proc 0][Train](32/100000) average pos_loss: 3.534722328186035\n",
      "[proc 0][Train](32/100000) average neg_loss: 0.37669020891189575\n",
      "[proc 0][Train](32/100000) average loss: 1.955706238746643\n",
      "[proc 0][Train](32/100000) average regularization: 1.5880852970440174e-06\n",
      "[proc 0][Train] 1 steps take 2.929 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.451, backward: 0.069, update: 2.407\n",
      "[proc 1][Train](32/100000) average pos_loss: 3.532108783721924\n",
      "[proc 1][Train](32/100000) average neg_loss: 0.39594751596450806\n",
      "[proc 1][Train](32/100000) average loss: 1.9640281200408936\n",
      "[proc 1][Train](32/100000) average regularization: 1.6587860045547131e-06\n",
      "[proc 1][Train] 1 steps take 2.662 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.396, backward: 0.070, update: 2.195\n",
      "[proc 0][Train](33/100000) average pos_loss: 3.526782512664795\n",
      "[proc 0][Train](33/100000) average neg_loss: 0.3788360357284546\n",
      "[proc 0][Train](33/100000) average loss: 1.9528093338012695\n",
      "[proc 0][Train](33/100000) average regularization: 1.5846786709516891e-06\n",
      "[proc 0][Train] 1 steps take 2.822 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.505, backward: 0.070, update: 2.229\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](33/100000) average pos_loss: 3.417573928833008\n",
      "[proc 1][Train](33/100000) average neg_loss: 0.4230961203575134\n",
      "[proc 1][Train](33/100000) average loss: 1.920335054397583\n",
      "[proc 1][Train](33/100000) average regularization: 1.5333789633587003e-06\n",
      "[proc 1][Train] 1 steps take 2.895 seconds\n",
      "[proc 1]sample: 0.026, forward: 0.468, backward: 0.069, update: 2.331\n",
      "[proc 0][Train](34/100000) average pos_loss: 3.2738168239593506\n",
      "[proc 0][Train](34/100000) average neg_loss: 0.38912004232406616\n",
      "[proc 0][Train](34/100000) average loss: 1.8314684629440308\n",
      "[proc 0][Train](34/100000) average regularization: 1.713571236905409e-06\n",
      "[proc 0][Train] 1 steps take 2.820 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.486, backward: 0.069, update: 2.248\n",
      "[proc 1][Train](34/100000) average pos_loss: 3.443295478820801\n",
      "[proc 1][Train](34/100000) average neg_loss: 0.3749896287918091\n",
      "[proc 1][Train](34/100000) average loss: 1.9091424942016602\n",
      "[proc 1][Train](34/100000) average regularization: 1.7840230839283322e-06\n",
      "[proc 1][Train] 1 steps take 2.694 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.445, backward: 0.070, update: 2.164\n",
      "[proc 1][Train](35/100000) average pos_loss: 3.2246508598327637\n",
      "[proc 1][Train](35/100000) average neg_loss: 0.33386319875717163\n",
      "[proc 1][Train](35/100000) average loss: 1.77925705909729\n",
      "[proc 1][Train](35/100000) average regularization: 1.5783994058438111e-06\n",
      "[proc 1][Train] 1 steps take 2.854 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.476, backward: 0.069, update: 2.306\n",
      "[proc 0][Train](35/100000) average pos_loss: 3.214991807937622\n",
      "[proc 0][Train](35/100000) average neg_loss: 0.4006887972354889\n",
      "[proc 0][Train](35/100000) average loss: 1.807840347290039\n",
      "[proc 0][Train](35/100000) average regularization: 1.5686426877437043e-06\n",
      "[proc 0][Train] 1 steps take 2.943 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.536, backward: 0.070, update: 2.336\n",
      "[proc 1][Train](36/100000) average pos_loss: 3.142911911010742\n",
      "[proc 1][Train](36/100000) average neg_loss: 0.3930503726005554\n",
      "[proc 1][Train](36/100000) average loss: 1.7679811716079712\n",
      "[proc 1][Train](36/100000) average regularization: 1.7426481235816027e-06\n",
      "[proc 1][Train] 1 steps take 2.848 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.484, backward: 0.070, update: 2.292\n",
      "[proc 0][Train](36/100000) average pos_loss: 3.1461973190307617\n",
      "[proc 0][Train](36/100000) average neg_loss: 0.420586496591568\n",
      "[proc 0][Train](36/100000) average loss: 1.7833919525146484\n",
      "[proc 0][Train](36/100000) average regularization: 1.7297689964834717e-06\n",
      "[proc 0][Train] 1 steps take 2.800 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.469, backward: 0.070, update: 2.259\n",
      "[proc 0][Train](37/100000) average pos_loss: 2.9244680404663086\n",
      "[proc 0][Train](37/100000) average neg_loss: 0.4233601689338684\n",
      "[proc 0][Train](37/100000) average loss: 1.6739140748977661\n",
      "[proc 0][Train](37/100000) average regularization: 1.6494102510478115e-06\n",
      "[proc 0][Train] 1 steps take 2.720 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.474, backward: 0.069, update: 2.175\n",
      "[proc 1][Train](37/100000) average pos_loss: 3.027541160583496\n",
      "[proc 1][Train](37/100000) average neg_loss: 0.4043218493461609\n",
      "[proc 1][Train](37/100000) average loss: 1.7159315347671509\n",
      "[proc 1][Train](37/100000) average regularization: 1.6556398350076051e-06\n",
      "[proc 1][Train] 1 steps take 2.913 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.514, backward: 0.069, update: 2.329\n",
      "[proc 0][Train](38/100000) average pos_loss: 3.0486910343170166\n",
      "[proc 0][Train](38/100000) average neg_loss: 0.40251797437667847\n",
      "[proc 0][Train](38/100000) average loss: 1.72560453414917\n",
      "[proc 0][Train](38/100000) average regularization: 1.9933197563659633e-06\n",
      "[proc 0][Train] 1 steps take 2.803 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.506, backward: 0.069, update: 2.226\n",
      "[proc 1][Train](38/100000) average pos_loss: 3.0689215660095215\n",
      "[proc 1][Train](38/100000) average neg_loss: 0.41227614879608154\n",
      "[proc 1][Train](38/100000) average loss: 1.7405989170074463\n",
      "[proc 1][Train](38/100000) average regularization: 1.8920680986411753e-06\n",
      "[proc 1][Train] 1 steps take 2.811 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.457, backward: 0.069, update: 2.284\n",
      "[proc 0][Train](39/100000) average pos_loss: 2.9526889324188232\n",
      "[proc 0][Train](39/100000) average neg_loss: 0.40299350023269653\n",
      "[proc 0][Train](39/100000) average loss: 1.6778411865234375\n",
      "[proc 0][Train](39/100000) average regularization: 1.7515200170237222e-06\n",
      "[proc 0][Train] 1 steps take 2.998 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.505, backward: 0.069, update: 2.422\n",
      "[proc 1][Train](39/100000) average pos_loss: 2.9685628414154053\n",
      "[proc 1][Train](39/100000) average neg_loss: 0.4140585958957672\n",
      "[proc 1][Train](39/100000) average loss: 1.6913107633590698\n",
      "[proc 1][Train](39/100000) average regularization: 1.7102667015933548e-06\n",
      "[proc 1][Train] 1 steps take 2.912 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.514, backward: 0.070, update: 2.326\n",
      "[proc 0][Train](40/100000) average pos_loss: 2.9196016788482666\n",
      "[proc 0][Train](40/100000) average neg_loss: 0.42158764600753784\n",
      "[proc 0][Train](40/100000) average loss: 1.6705946922302246\n",
      "[proc 0][Train](40/100000) average regularization: 1.8691586092245416e-06\n",
      "[proc 0][Train] 1 steps take 2.821 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.501, backward: 0.070, update: 2.248\n",
      "[proc 1][Train](40/100000) average pos_loss: 2.856916666030884\n",
      "[proc 1][Train](40/100000) average neg_loss: 0.4069274663925171\n",
      "[proc 1][Train](40/100000) average loss: 1.6319220066070557\n",
      "[proc 1][Train](40/100000) average regularization: 1.9501280803524423e-06\n",
      "[proc 1][Train] 1 steps take 2.924 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.473, backward: 0.069, update: 2.379\n",
      "[proc 0][Train](41/100000) average pos_loss: 2.7078309059143066\n",
      "[proc 0][Train](41/100000) average neg_loss: 0.4353429079055786\n",
      "[proc 0][Train](41/100000) average loss: 1.5715868473052979\n",
      "[proc 0][Train](41/100000) average regularization: 1.7975992250285344e-06\n",
      "[proc 0][Train] 1 steps take 2.735 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.521, backward: 0.071, update: 2.142\n",
      "[proc 1][Train](41/100000) average pos_loss: 2.790473461151123\n",
      "[proc 1][Train](41/100000) average neg_loss: 0.40126967430114746\n",
      "[proc 1][Train](41/100000) average loss: 1.5958715677261353\n",
      "[proc 1][Train](41/100000) average regularization: 1.865595322669833e-06\n",
      "[proc 1][Train] 1 steps take 2.806 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.462, backward: 0.069, update: 2.273\n",
      "[proc 0][Train](42/100000) average pos_loss: 2.619683265686035\n",
      "[proc 0][Train](42/100000) average neg_loss: 0.41387125849723816\n",
      "[proc 0][Train](42/100000) average loss: 1.5167772769927979\n",
      "[proc 0][Train](42/100000) average regularization: 1.96291057363851e-06\n",
      "[proc 0][Train] 1 steps take 2.703 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.469, backward: 0.072, update: 2.161\n",
      "[proc 1][Train](42/100000) average pos_loss: 2.7359602451324463\n",
      "[proc 1][Train](42/100000) average neg_loss: 0.43309468030929565\n",
      "[proc 1][Train](42/100000) average loss: 1.5845274925231934\n",
      "[proc 1][Train](42/100000) average regularization: 2.0090255929972045e-06\n",
      "[proc 1][Train] 1 steps take 2.956 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.451, backward: 0.071, update: 2.432\n",
      "[proc 0][Train](43/100000) average pos_loss: 2.6931235790252686\n",
      "[proc 0][Train](43/100000) average neg_loss: 0.40353894233703613\n",
      "[proc 0][Train](43/100000) average loss: 1.5483312606811523\n",
      "[proc 0][Train](43/100000) average regularization: 1.8452267340762774e-06\n",
      "[proc 0][Train] 1 steps take 2.714 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.215\n",
      "[proc 1][Train](43/100000) average pos_loss: 2.6525983810424805\n",
      "[proc 1][Train](43/100000) average neg_loss: 0.389854371547699\n",
      "[proc 1][Train](43/100000) average loss: 1.521226406097412\n",
      "[proc 1][Train](43/100000) average regularization: 1.9530132249201415e-06\n",
      "[proc 1][Train] 1 steps take 2.682 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.434, backward: 0.069, update: 2.177\n",
      "[proc 0][Train](44/100000) average pos_loss: 2.650681257247925\n",
      "[proc 0][Train](44/100000) average neg_loss: 0.41927918791770935\n",
      "[proc 0][Train](44/100000) average loss: 1.5349801778793335\n",
      "[proc 0][Train](44/100000) average regularization: 2.01351963369234e-06\n",
      "[proc 0][Train] 1 steps take 2.695 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.071, update: 2.184\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](44/100000) average pos_loss: 2.655526638031006\n",
      "[proc 1][Train](44/100000) average neg_loss: 0.4615135192871094\n",
      "[proc 1][Train](44/100000) average loss: 1.5585200786590576\n",
      "[proc 1][Train](44/100000) average regularization: 2.0203976873744978e-06\n",
      "[proc 1][Train] 1 steps take 2.803 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.400, backward: 0.070, update: 2.331\n",
      "[proc 0][Train](45/100000) average pos_loss: 2.6914446353912354\n",
      "[proc 0][Train](45/100000) average neg_loss: 0.4003305435180664\n",
      "[proc 0][Train](45/100000) average loss: 1.5458875894546509\n",
      "[proc 0][Train](45/100000) average regularization: 1.942932158272015e-06\n",
      "[proc 0][Train] 1 steps take 2.567 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.420, backward: 0.071, update: 2.074\n",
      "[proc 1][Train](45/100000) average pos_loss: 2.4930062294006348\n",
      "[proc 1][Train](45/100000) average neg_loss: 0.38489073514938354\n",
      "[proc 1][Train](45/100000) average loss: 1.4389485120773315\n",
      "[proc 1][Train](45/100000) average regularization: 2.1035025383753236e-06\n",
      "[proc 1][Train] 1 steps take 2.791 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.430, backward: 0.072, update: 2.288\n",
      "[proc 0][Train](46/100000) average pos_loss: 2.4823713302612305\n",
      "[proc 0][Train](46/100000) average neg_loss: 0.456572949886322\n",
      "[proc 0][Train](46/100000) average loss: 1.4694721698760986\n",
      "[proc 0][Train](46/100000) average regularization: 2.062667590507772e-06\n",
      "[proc 0][Train] 1 steps take 2.663 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.439, backward: 0.070, update: 2.152\n",
      "[proc 1][Train](46/100000) average pos_loss: 2.556630849838257\n",
      "[proc 1][Train](46/100000) average neg_loss: 0.4355023205280304\n",
      "[proc 1][Train](46/100000) average loss: 1.4960665702819824\n",
      "[proc 1][Train](46/100000) average regularization: 2.024032255576458e-06\n",
      "[proc 1][Train] 1 steps take 2.700 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.438, backward: 0.070, update: 2.190\n",
      "[proc 0][Train](47/100000) average pos_loss: 2.3943123817443848\n",
      "[proc 0][Train](47/100000) average neg_loss: 0.41403305530548096\n",
      "[proc 0][Train](47/100000) average loss: 1.404172658920288\n",
      "[proc 0][Train](47/100000) average regularization: 2.0930174287059344e-06\n",
      "[proc 0][Train] 1 steps take 2.737 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.437, backward: 0.069, update: 2.229\n",
      "[proc 1][Train](47/100000) average pos_loss: 2.4171347618103027\n",
      "[proc 1][Train](47/100000) average neg_loss: 0.3498193025588989\n",
      "[proc 1][Train](47/100000) average loss: 1.383476972579956\n",
      "[proc 1][Train](47/100000) average regularization: 2.121592842740938e-06\n",
      "[proc 1][Train] 1 steps take 2.712 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.069, update: 2.214\n",
      "[proc 0][Train](48/100000) average pos_loss: 2.379564046859741\n",
      "[proc 0][Train](48/100000) average neg_loss: 0.5012319087982178\n",
      "[proc 0][Train](48/100000) average loss: 1.4403979778289795\n",
      "[proc 0][Train](48/100000) average regularization: 2.085346523017506e-06\n",
      "[proc 0][Train] 1 steps take 2.696 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.438, backward: 0.070, update: 2.186\n",
      "[proc 1][Train](48/100000) average pos_loss: 2.4036576747894287\n",
      "[proc 1][Train](48/100000) average neg_loss: 0.44189542531967163\n",
      "[proc 1][Train](48/100000) average loss: 1.4227765798568726\n",
      "[proc 1][Train](48/100000) average regularization: 2.1084067611809587e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.163\n",
      "[proc 0][Train](49/100000) average pos_loss: 2.3394289016723633\n",
      "[proc 0][Train](49/100000) average neg_loss: 0.42961978912353516\n",
      "[proc 0][Train](49/100000) average loss: 1.3845243453979492\n",
      "[proc 0][Train](49/100000) average regularization: 2.091670012305258e-06\n",
      "[proc 0][Train] 1 steps take 2.674 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.434, backward: 0.069, update: 2.155\n",
      "[proc 1][Train](49/100000) average pos_loss: 2.3347504138946533\n",
      "[proc 1][Train](49/100000) average neg_loss: 0.38427674770355225\n",
      "[proc 1][Train](49/100000) average loss: 1.359513521194458\n",
      "[proc 1][Train](49/100000) average regularization: 2.199512891820632e-06\n",
      "[proc 1][Train] 1 steps take 3.474 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.426, backward: 0.070, update: 2.961\n",
      "[proc 0][Train](50/100000) average pos_loss: 2.230588912963867\n",
      "[proc 0][Train](50/100000) average neg_loss: 0.46879640221595764\n",
      "[proc 0][Train](50/100000) average loss: 1.349692702293396\n",
      "[proc 0][Train](50/100000) average regularization: 2.125424316545832e-06\n",
      "[proc 0][Train] 1 steps take 2.810 seconds\n",
      "[proc 0]sample: 0.014, forward: 0.434, backward: 0.069, update: 2.293\n",
      "[proc 1][Train](50/100000) average pos_loss: 2.2566275596618652\n",
      "[proc 1][Train](50/100000) average neg_loss: 0.479818195104599\n",
      "[proc 1][Train](50/100000) average loss: 1.3682228326797485\n",
      "[proc 1][Train](50/100000) average regularization: 2.201609049734543e-06\n",
      "[proc 1][Train] 1 steps take 2.968 seconds\n",
      "[proc 1]sample: 0.020, forward: 0.439, backward: 0.070, update: 2.439\n",
      "[proc 0][Train](51/100000) average pos_loss: 2.2374300956726074\n",
      "[proc 0][Train](51/100000) average neg_loss: 0.4029307961463928\n",
      "[proc 0][Train](51/100000) average loss: 1.3201804161071777\n",
      "[proc 0][Train](51/100000) average regularization: 2.1456480681081302e-06\n",
      "[proc 0][Train] 1 steps take 2.743 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.242\n",
      "[proc 1][Train](51/100000) average pos_loss: 2.267165184020996\n",
      "[proc 1][Train](51/100000) average neg_loss: 0.38414159417152405\n",
      "[proc 1][Train](51/100000) average loss: 1.3256534337997437\n",
      "[proc 1][Train](51/100000) average regularization: 2.3134728053264553e-06\n",
      "[proc 1][Train] 1 steps take 2.773 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.449, backward: 0.070, update: 2.251\n",
      "[proc 0][Train](52/100000) average pos_loss: 2.181267023086548\n",
      "[proc 0][Train](52/100000) average neg_loss: 0.4869791865348816\n",
      "[proc 0][Train](52/100000) average loss: 1.334123134613037\n",
      "[proc 0][Train](52/100000) average regularization: 2.3682434857619228e-06\n",
      "[proc 0][Train] 1 steps take 2.729 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.489, backward: 0.070, update: 2.168\n",
      "[proc 1][Train](52/100000) average pos_loss: 2.209280490875244\n",
      "[proc 1][Train](52/100000) average neg_loss: 0.47126850485801697\n",
      "[proc 1][Train](52/100000) average loss: 1.340274453163147\n",
      "[proc 1][Train](52/100000) average regularization: 2.2149029064166825e-06\n",
      "[proc 1][Train] 1 steps take 2.782 seconds\n",
      "[proc 1]sample: 0.003, forward: 0.440, backward: 0.070, update: 2.270\n",
      "[proc 0][Train](53/100000) average pos_loss: 2.1085689067840576\n",
      "[proc 0][Train](53/100000) average neg_loss: 0.40062353014945984\n",
      "[proc 0][Train](53/100000) average loss: 1.25459623336792\n",
      "[proc 0][Train](53/100000) average regularization: 2.27030977839604e-06\n",
      "[proc 0][Train] 1 steps take 2.808 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.071, update: 2.302\n",
      "[proc 1][Train](53/100000) average pos_loss: 2.1856675148010254\n",
      "[proc 1][Train](53/100000) average neg_loss: 0.37546345591545105\n",
      "[proc 1][Train](53/100000) average loss: 1.2805655002593994\n",
      "[proc 1][Train](53/100000) average regularization: 2.2769434053770965e-06\n",
      "[proc 1][Train] 1 steps take 2.772 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.450, backward: 0.069, update: 2.251\n",
      "[proc 0][Train](54/100000) average pos_loss: 2.18509578704834\n",
      "[proc 0][Train](54/100000) average neg_loss: 0.4944501519203186\n",
      "[proc 0][Train](54/100000) average loss: 1.3397729396820068\n",
      "[proc 0][Train](54/100000) average regularization: 2.2559902390639763e-06\n",
      "[proc 0][Train] 1 steps take 2.743 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.245\n",
      "[proc 1][Train](54/100000) average pos_loss: 2.0433707237243652\n",
      "[proc 1][Train](54/100000) average neg_loss: 0.4994005858898163\n",
      "[proc 1][Train](54/100000) average loss: 1.271385669708252\n",
      "[proc 1][Train](54/100000) average regularization: 2.2705551145918434e-06\n",
      "[proc 1][Train] 1 steps take 2.651 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.442, backward: 0.069, update: 2.139\n",
      "[proc 0][Train](55/100000) average pos_loss: 2.1306304931640625\n",
      "[proc 0][Train](55/100000) average neg_loss: 0.3889315724372864\n",
      "[proc 0][Train](55/100000) average loss: 1.259781002998352\n",
      "[proc 0][Train](55/100000) average regularization: 2.2410577003029175e-06\n",
      "[proc 0][Train] 1 steps take 2.739 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.429, backward: 0.070, update: 2.238\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](55/100000) average pos_loss: 2.03725004196167\n",
      "[proc 1][Train](55/100000) average neg_loss: 0.38079530000686646\n",
      "[proc 1][Train](55/100000) average loss: 1.2090226411819458\n",
      "[proc 1][Train](55/100000) average regularization: 2.316233121746336e-06\n",
      "[proc 1][Train] 1 steps take 2.814 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.458, backward: 0.070, update: 2.284\n",
      "[proc 0][Train](56/100000) average pos_loss: 2.102231979370117\n",
      "[proc 0][Train](56/100000) average neg_loss: 0.44959554076194763\n",
      "[proc 0][Train](56/100000) average loss: 1.2759137153625488\n",
      "[proc 0][Train](56/100000) average regularization: 2.4225089418905554e-06\n",
      "[proc 0][Train] 1 steps take 2.807 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.301\n",
      "[proc 1][Train](56/100000) average pos_loss: 2.027817726135254\n",
      "[proc 1][Train](56/100000) average neg_loss: 0.49231860041618347\n",
      "[proc 1][Train](56/100000) average loss: 1.2600681781768799\n",
      "[proc 1][Train](56/100000) average regularization: 2.4543960535083897e-06\n",
      "[proc 1][Train] 1 steps take 2.758 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.453, backward: 0.070, update: 2.234\n",
      "[proc 0][Train](57/100000) average pos_loss: 2.0556068420410156\n",
      "[proc 0][Train](57/100000) average neg_loss: 0.38869577646255493\n",
      "[proc 0][Train](57/100000) average loss: 1.222151279449463\n",
      "[proc 0][Train](57/100000) average regularization: 2.3610932657902595e-06\n",
      "[proc 0][Train] 1 steps take 3.672 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.070, update: 3.167\n",
      "[proc 1][Train](57/100000) average pos_loss: 1.9715704917907715\n",
      "[proc 1][Train](57/100000) average neg_loss: 0.37870949506759644\n",
      "[proc 1][Train](57/100000) average loss: 1.1751400232315063\n",
      "[proc 1][Train](57/100000) average regularization: 2.3778516151651274e-06\n",
      "[proc 1][Train] 1 steps take 2.631 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.397, backward: 0.070, update: 2.163\n",
      "[proc 0][Train](58/100000) average pos_loss: 1.976449966430664\n",
      "[proc 0][Train](58/100000) average neg_loss: 0.4776586592197418\n",
      "[proc 0][Train](58/100000) average loss: 1.2270543575286865\n",
      "[proc 0][Train](58/100000) average regularization: 2.4697937988094054e-06\n",
      "[proc 0][Train] 1 steps take 2.789 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.473, backward: 0.071, update: 2.242\n",
      "[proc 1][Train](58/100000) average pos_loss: 1.9438340663909912\n",
      "[proc 1][Train](58/100000) average neg_loss: 0.4883500933647156\n",
      "[proc 1][Train](58/100000) average loss: 1.2160921096801758\n",
      "[proc 1][Train](58/100000) average regularization: 2.353859827053384e-06\n",
      "[proc 1][Train] 1 steps take 2.718 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.398, backward: 0.069, update: 2.249\n",
      "[proc 0][Train](59/100000) average pos_loss: 1.9666061401367188\n",
      "[proc 0][Train](59/100000) average neg_loss: 0.41153135895729065\n",
      "[proc 0][Train](59/100000) average loss: 1.1890687942504883\n",
      "[proc 0][Train](59/100000) average regularization: 2.3172265173343476e-06\n",
      "[proc 0][Train] 1 steps take 3.186 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.720, backward: 0.069, update: 2.394\n",
      "[proc 1][Train](59/100000) average pos_loss: 1.960590124130249\n",
      "[proc 1][Train](59/100000) average neg_loss: 0.36443373560905457\n",
      "[proc 1][Train](59/100000) average loss: 1.162511944770813\n",
      "[proc 1][Train](59/100000) average regularization: 2.4079142804112053e-06\n",
      "[proc 1][Train] 1 steps take 2.785 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.422, backward: 0.069, update: 2.292\n",
      "[proc 0][Train](60/100000) average pos_loss: 1.9310784339904785\n",
      "[proc 0][Train](60/100000) average neg_loss: 0.4812372922897339\n",
      "[proc 0][Train](60/100000) average loss: 1.206157922744751\n",
      "[proc 0][Train](60/100000) average regularization: 2.4980020043585682e-06\n",
      "[proc 0][Train] 1 steps take 2.927 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.492, backward: 0.070, update: 2.363\n",
      "[proc 1][Train](60/100000) average pos_loss: 1.8192756175994873\n",
      "[proc 1][Train](60/100000) average neg_loss: 0.4813934564590454\n",
      "[proc 1][Train](60/100000) average loss: 1.1503345966339111\n",
      "[proc 1][Train](60/100000) average regularization: 2.469800847393344e-06\n",
      "[proc 1][Train] 1 steps take 2.776 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.428, backward: 0.070, update: 2.277\n",
      "[proc 0][Train](61/100000) average pos_loss: 1.893917202949524\n",
      "[proc 0][Train](61/100000) average neg_loss: 0.39049020409584045\n",
      "[proc 0][Train](61/100000) average loss: 1.142203688621521\n",
      "[proc 0][Train](61/100000) average regularization: 2.463853661538451e-06\n",
      "[proc 0][Train] 1 steps take 2.866 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.505, backward: 0.070, update: 2.289\n",
      "[proc 1][Train](61/100000) average pos_loss: 1.7718231678009033\n",
      "[proc 1][Train](61/100000) average neg_loss: 0.4125521183013916\n",
      "[proc 1][Train](61/100000) average loss: 1.0921876430511475\n",
      "[proc 1][Train](61/100000) average regularization: 2.5167007606796687e-06\n",
      "[proc 1][Train] 1 steps take 2.917 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.477, backward: 0.070, update: 2.368\n",
      "[proc 1][Train](62/100000) average pos_loss: 1.9408866167068481\n",
      "[proc 1][Train](62/100000) average neg_loss: 0.4602909982204437\n",
      "[proc 1][Train](62/100000) average loss: 1.2005888223648071\n",
      "[proc 1][Train](62/100000) average regularization: 2.633112671901472e-06\n",
      "[proc 1][Train] 1 steps take 2.799 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.448, backward: 0.070, update: 2.279\n",
      "[proc 0][Train](62/100000) average pos_loss: 1.8854255676269531\n",
      "[proc 0][Train](62/100000) average neg_loss: 0.46018391847610474\n",
      "[proc 0][Train](62/100000) average loss: 1.1728047132492065\n",
      "[proc 0][Train](62/100000) average regularization: 2.6121495011466322e-06\n",
      "[proc 0][Train] 1 steps take 3.005 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.491, backward: 0.069, update: 2.443\n",
      "[proc 1][Train](63/100000) average pos_loss: 1.9719852209091187\n",
      "[proc 1][Train](63/100000) average neg_loss: 0.40290597081184387\n",
      "[proc 1][Train](63/100000) average loss: 1.1874456405639648\n",
      "[proc 1][Train](63/100000) average regularization: 2.408233740425203e-06\n",
      "[proc 1][Train] 1 steps take 2.627 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.469, backward: 0.069, update: 2.087\n",
      "[proc 0][Train](63/100000) average pos_loss: 1.9206318855285645\n",
      "[proc 0][Train](63/100000) average neg_loss: 0.39342477917671204\n",
      "[proc 0][Train](63/100000) average loss: 1.157028317451477\n",
      "[proc 0][Train](63/100000) average regularization: 2.5462059056735598e-06\n",
      "[proc 0][Train] 1 steps take 2.717 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.458, backward: 0.070, update: 2.187\n",
      "[proc 1][Train](64/100000) average pos_loss: 1.7282285690307617\n",
      "[proc 1][Train](64/100000) average neg_loss: 0.5222535729408264\n",
      "[proc 1][Train](64/100000) average loss: 1.1252410411834717\n",
      "[proc 1][Train](64/100000) average regularization: 2.700132881727768e-06\n",
      "[proc 1][Train] 1 steps take 2.819 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.514, backward: 0.069, update: 2.235\n",
      "[proc 0][Train](64/100000) average pos_loss: 1.839901328086853\n",
      "[proc 0][Train](64/100000) average neg_loss: 0.47532063722610474\n",
      "[proc 0][Train](64/100000) average loss: 1.1576110124588013\n",
      "[proc 0][Train](64/100000) average regularization: 2.53795974458626e-06\n",
      "[proc 0][Train] 1 steps take 2.809 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.472, backward: 0.070, update: 2.265\n",
      "[proc 1][Train](65/100000) average pos_loss: 1.6964588165283203\n",
      "[proc 1][Train](65/100000) average neg_loss: 0.42038464546203613\n",
      "[proc 1][Train](65/100000) average loss: 1.0584217309951782\n",
      "[proc 1][Train](65/100000) average regularization: 2.578163275757106e-06\n",
      "[proc 1][Train] 1 steps take 2.755 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.501, backward: 0.069, update: 2.168\n",
      "[proc 0][Train](65/100000) average pos_loss: 1.7917066812515259\n",
      "[proc 0][Train](65/100000) average neg_loss: 0.3682686686515808\n",
      "[proc 0][Train](65/100000) average loss: 1.079987645149231\n",
      "[proc 0][Train](65/100000) average regularization: 2.6203329070995096e-06\n",
      "[proc 0][Train] 1 steps take 2.962 seconds\n",
      "[proc 0]sample: 0.020, forward: 0.458, backward: 0.069, update: 2.414\n",
      "[proc 1][Train](66/100000) average pos_loss: 1.7241164445877075\n",
      "[proc 1][Train](66/100000) average neg_loss: 0.4958232641220093\n",
      "[proc 1][Train](66/100000) average loss: 1.1099698543548584\n",
      "[proc 1][Train](66/100000) average regularization: 2.603863094918779e-06\n",
      "[proc 1][Train] 1 steps take 2.709 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.466, backward: 0.069, update: 2.156\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](66/100000) average pos_loss: 1.8157953023910522\n",
      "[proc 0][Train](66/100000) average neg_loss: 0.5005790591239929\n",
      "[proc 0][Train](66/100000) average loss: 1.1581871509552002\n",
      "[proc 0][Train](66/100000) average regularization: 2.588224106148118e-06\n",
      "[proc 0][Train] 1 steps take 2.735 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.431, backward: 0.069, update: 2.217\n",
      "[proc 1][Train](67/100000) average pos_loss: 1.7446918487548828\n",
      "[proc 1][Train](67/100000) average neg_loss: 0.4065309762954712\n",
      "[proc 1][Train](67/100000) average loss: 1.0756113529205322\n",
      "[proc 1][Train](67/100000) average regularization: 2.6292200345778838e-06\n",
      "[proc 1][Train] 1 steps take 2.674 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.427, backward: 0.071, update: 2.174\n",
      "[proc 0][Train](67/100000) average pos_loss: 1.7378671169281006\n",
      "[proc 0][Train](67/100000) average neg_loss: 0.39394572377204895\n",
      "[proc 0][Train](67/100000) average loss: 1.0659064054489136\n",
      "[proc 0][Train](67/100000) average regularization: 2.7288804176350823e-06\n",
      "[proc 0][Train] 1 steps take 2.769 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.431, backward: 0.070, update: 2.266\n",
      "[proc 1][Train](68/100000) average pos_loss: 1.7858905792236328\n",
      "[proc 1][Train](68/100000) average neg_loss: 0.45821040868759155\n",
      "[proc 1][Train](68/100000) average loss: 1.1220505237579346\n",
      "[proc 1][Train](68/100000) average regularization: 2.6870472993323347e-06\n",
      "[proc 1][Train] 1 steps take 2.634 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.436, backward: 0.071, update: 2.126\n",
      "[proc 0][Train](68/100000) average pos_loss: 1.7118947505950928\n",
      "[proc 0][Train](68/100000) average neg_loss: 0.4960744380950928\n",
      "[proc 0][Train](68/100000) average loss: 1.1039845943450928\n",
      "[proc 0][Train](68/100000) average regularization: 2.6856862405111315e-06\n",
      "[proc 0][Train] 1 steps take 2.749 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.442, backward: 0.071, update: 2.234\n",
      "[proc 1][Train](69/100000) average pos_loss: 1.7570269107818604\n",
      "[proc 1][Train](69/100000) average neg_loss: 0.4169390797615051\n",
      "[proc 1][Train](69/100000) average loss: 1.0869829654693604\n",
      "[proc 1][Train](69/100000) average regularization: 2.572536232037237e-06\n",
      "[proc 1][Train] 1 steps take 2.631 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.130\n",
      "[proc 0][Train](69/100000) average pos_loss: 1.723585605621338\n",
      "[proc 0][Train](69/100000) average neg_loss: 0.42622992396354675\n",
      "[proc 0][Train](69/100000) average loss: 1.0749077796936035\n",
      "[proc 0][Train](69/100000) average regularization: 2.719418262131512e-06\n",
      "[proc 0][Train] 1 steps take 2.667 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.442, backward: 0.070, update: 2.153\n",
      "[proc 1][Train](70/100000) average pos_loss: 1.711012601852417\n",
      "[proc 1][Train](70/100000) average neg_loss: 0.4958553910255432\n",
      "[proc 1][Train](70/100000) average loss: 1.1034339666366577\n",
      "[proc 1][Train](70/100000) average regularization: 2.870712251024088e-06\n",
      "[proc 1][Train] 1 steps take 2.677 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.177\n",
      "[proc 0][Train](70/100000) average pos_loss: 1.6435678005218506\n",
      "[proc 0][Train](70/100000) average neg_loss: 0.49633079767227173\n",
      "[proc 0][Train](70/100000) average loss: 1.0699492692947388\n",
      "[proc 0][Train](70/100000) average regularization: 2.791241513477871e-06\n",
      "[proc 0][Train] 1 steps take 2.671 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.439, backward: 0.070, update: 2.160\n",
      "[proc 1][Train](71/100000) average pos_loss: 1.7793173789978027\n",
      "[proc 1][Train](71/100000) average neg_loss: 0.38889944553375244\n",
      "[proc 1][Train](71/100000) average loss: 1.0841083526611328\n",
      "[proc 1][Train](71/100000) average regularization: 2.726394313867786e-06\n",
      "[proc 1][Train] 1 steps take 2.658 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.416, backward: 0.070, update: 2.170\n",
      "[proc 0][Train](71/100000) average pos_loss: 1.643536925315857\n",
      "[proc 0][Train](71/100000) average neg_loss: 0.35480403900146484\n",
      "[proc 0][Train](71/100000) average loss: 0.9991704821586609\n",
      "[proc 0][Train](71/100000) average regularization: 2.770296987364418e-06\n",
      "[proc 0][Train] 1 steps take 2.760 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.537, backward: 0.070, update: 2.151\n",
      "[proc 1][Train](72/100000) average pos_loss: 1.6636865139007568\n",
      "[proc 1][Train](72/100000) average neg_loss: 0.5097805857658386\n",
      "[proc 1][Train](72/100000) average loss: 1.0867335796356201\n",
      "[proc 1][Train](72/100000) average regularization: 2.7878386390511878e-06\n",
      "[proc 1][Train] 1 steps take 2.555 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.422, backward: 0.070, update: 2.061\n",
      "[proc 0][Train](72/100000) average pos_loss: 1.632459044456482\n",
      "[proc 0][Train](72/100000) average neg_loss: 0.49384382367134094\n",
      "[proc 0][Train](72/100000) average loss: 1.063151478767395\n",
      "[proc 0][Train](72/100000) average regularization: 2.714354423005716e-06\n",
      "[proc 0][Train] 1 steps take 2.621 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.408, backward: 0.070, update: 2.142\n",
      "[proc 1][Train](73/100000) average pos_loss: 1.6560289859771729\n",
      "[proc 1][Train](73/100000) average neg_loss: 0.37254583835601807\n",
      "[proc 1][Train](73/100000) average loss: 1.0142874717712402\n",
      "[proc 1][Train](73/100000) average regularization: 2.745155825323309e-06\n",
      "[proc 1][Train] 1 steps take 2.631 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.416, backward: 0.070, update: 2.143\n",
      "[proc 0][Train](73/100000) average pos_loss: 1.5671361684799194\n",
      "[proc 0][Train](73/100000) average neg_loss: 0.37445127964019775\n",
      "[proc 0][Train](73/100000) average loss: 0.9707937240600586\n",
      "[proc 0][Train](73/100000) average regularization: 2.861267148546176e-06\n",
      "[proc 0][Train] 1 steps take 2.743 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.411, backward: 0.071, update: 2.260\n",
      "[proc 1][Train](74/100000) average pos_loss: 1.5673421621322632\n",
      "[proc 1][Train](74/100000) average neg_loss: 0.5098726749420166\n",
      "[proc 1][Train](74/100000) average loss: 1.0386073589324951\n",
      "[proc 1][Train](74/100000) average regularization: 2.903562744904775e-06\n",
      "[proc 1][Train] 1 steps take 2.713 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.213\n",
      "[proc 0][Train](74/100000) average pos_loss: 1.598771333694458\n",
      "[proc 0][Train](74/100000) average neg_loss: 0.4992148280143738\n",
      "[proc 0][Train](74/100000) average loss: 1.0489931106567383\n",
      "[proc 0][Train](74/100000) average regularization: 2.9713285130128497e-06\n",
      "[proc 0][Train] 1 steps take 2.800 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.469, backward: 0.070, update: 2.258\n",
      "[proc 1][Train](75/100000) average pos_loss: 1.563098430633545\n",
      "[proc 1][Train](75/100000) average neg_loss: 0.3826388418674469\n",
      "[proc 1][Train](75/100000) average loss: 0.9728686213493347\n",
      "[proc 1][Train](75/100000) average regularization: 2.7214910005568527e-06\n",
      "[proc 1][Train] 1 steps take 2.958 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.448, backward: 0.070, update: 2.438\n",
      "[proc 0][Train](75/100000) average pos_loss: 1.5717657804489136\n",
      "[proc 0][Train](75/100000) average neg_loss: 0.34746313095092773\n",
      "[proc 0][Train](75/100000) average loss: 0.9596144556999207\n",
      "[proc 0][Train](75/100000) average regularization: 2.9477887437678874e-06\n",
      "[proc 0][Train] 1 steps take 2.859 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.437, backward: 0.070, update: 2.349\n",
      "[proc 1][Train](76/100000) average pos_loss: 1.5144875049591064\n",
      "[proc 1][Train](76/100000) average neg_loss: 0.532753050327301\n",
      "[proc 1][Train](76/100000) average loss: 1.0236202478408813\n",
      "[proc 1][Train](76/100000) average regularization: 3.0776193398196483e-06\n",
      "[proc 1][Train] 1 steps take 2.703 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.445, backward: 0.071, update: 2.186\n",
      "[proc 0][Train](76/100000) average pos_loss: 1.5710253715515137\n",
      "[proc 0][Train](76/100000) average neg_loss: 0.5256415605545044\n",
      "[proc 0][Train](76/100000) average loss: 1.0483334064483643\n",
      "[proc 0][Train](76/100000) average regularization: 2.961820882774191e-06\n",
      "[proc 0][Train] 1 steps take 2.645 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.073, update: 2.136\n",
      "[proc 1][Train](77/100000) average pos_loss: 1.5204894542694092\n",
      "[proc 1][Train](77/100000) average neg_loss: 0.3717374801635742\n",
      "[proc 1][Train](77/100000) average loss: 0.9461134672164917\n",
      "[proc 1][Train](77/100000) average regularization: 2.9730617825407535e-06\n",
      "[proc 1][Train] 1 steps take 2.689 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.442, backward: 0.069, update: 2.176\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](77/100000) average pos_loss: 1.5509196519851685\n",
      "[proc 0][Train](77/100000) average neg_loss: 0.3982469439506531\n",
      "[proc 0][Train](77/100000) average loss: 0.9745832681655884\n",
      "[proc 0][Train](77/100000) average regularization: 2.941318143712124e-06\n",
      "[proc 0][Train] 1 steps take 2.655 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.444, backward: 0.072, update: 2.137\n",
      "[proc 1][Train](78/100000) average pos_loss: 1.5200607776641846\n",
      "[proc 1][Train](78/100000) average neg_loss: 0.5272951126098633\n",
      "[proc 1][Train](78/100000) average loss: 1.023677945137024\n",
      "[proc 1][Train](78/100000) average regularization: 3.008342901011929e-06\n",
      "[proc 1][Train] 1 steps take 2.672 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.168\n",
      "[proc 0][Train](78/100000) average pos_loss: 1.5037167072296143\n",
      "[proc 0][Train](78/100000) average neg_loss: 0.4786224365234375\n",
      "[proc 0][Train](78/100000) average loss: 0.9911695718765259\n",
      "[proc 0][Train](78/100000) average regularization: 2.960464144052821e-06\n",
      "[proc 0][Train] 1 steps take 2.681 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.176\n",
      "[proc 1][Train](79/100000) average pos_loss: 1.4998353719711304\n",
      "[proc 1][Train](79/100000) average neg_loss: 0.3990316689014435\n",
      "[proc 1][Train](79/100000) average loss: 0.9494335055351257\n",
      "[proc 1][Train](79/100000) average regularization: 3.054514536415809e-06\n",
      "[proc 1][Train] 1 steps take 2.687 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.179\n",
      "[proc 0][Train](79/100000) average pos_loss: 1.5324227809906006\n",
      "[proc 0][Train](79/100000) average neg_loss: 0.3546496033668518\n",
      "[proc 0][Train](79/100000) average loss: 0.9435361623764038\n",
      "[proc 0][Train](79/100000) average regularization: 3.0261035135481507e-06\n",
      "[proc 0][Train] 1 steps take 2.662 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.461, backward: 0.070, update: 2.129\n",
      "[proc 1][Train](80/100000) average pos_loss: 1.5089792013168335\n",
      "[proc 1][Train](80/100000) average neg_loss: 0.5023206472396851\n",
      "[proc 1][Train](80/100000) average loss: 1.0056499242782593\n",
      "[proc 1][Train](80/100000) average regularization: 3.1817614853935083e-06\n",
      "[proc 1][Train] 1 steps take 2.685 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.179\n",
      "[proc 0][Train](80/100000) average pos_loss: 1.4800183773040771\n",
      "[proc 0][Train](80/100000) average neg_loss: 0.491519570350647\n",
      "[proc 0][Train](80/100000) average loss: 0.9857689738273621\n",
      "[proc 0][Train](80/100000) average regularization: 3.036485622942564e-06\n",
      "[proc 0][Train] 1 steps take 2.609 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.407, backward: 0.070, update: 2.131\n",
      "[proc 1][Train](81/100000) average pos_loss: 1.4212431907653809\n",
      "[proc 1][Train](81/100000) average neg_loss: 0.36924493312835693\n",
      "[proc 1][Train](81/100000) average loss: 0.8952440619468689\n",
      "[proc 1][Train](81/100000) average regularization: 3.2026957796915667e-06\n",
      "[proc 1][Train] 1 steps take 2.574 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.407, backward: 0.070, update: 2.078\n",
      "[proc 0][Train](81/100000) average pos_loss: 1.4342352151870728\n",
      "[proc 0][Train](81/100000) average neg_loss: 0.3734799027442932\n",
      "[proc 0][Train](81/100000) average loss: 0.9038575887680054\n",
      "[proc 0][Train](81/100000) average regularization: 3.11056692225975e-06\n",
      "[proc 0][Train] 1 steps take 2.646 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.439, backward: 0.069, update: 2.120\n",
      "[proc 1][Train](82/100000) average pos_loss: 1.5076793432235718\n",
      "[proc 1][Train](82/100000) average neg_loss: 0.532734215259552\n",
      "[proc 1][Train](82/100000) average loss: 1.0202068090438843\n",
      "[proc 1][Train](82/100000) average regularization: 3.167667955494835e-06\n",
      "[proc 1][Train] 1 steps take 2.868 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.429, backward: 0.070, update: 2.355\n",
      "[proc 0][Train](82/100000) average pos_loss: 1.5064194202423096\n",
      "[proc 0][Train](82/100000) average neg_loss: 0.5269798040390015\n",
      "[proc 0][Train](82/100000) average loss: 1.0166995525360107\n",
      "[proc 0][Train](82/100000) average regularization: 3.097191211054451e-06\n",
      "[proc 0][Train] 1 steps take 2.712 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.432, backward: 0.069, update: 2.193\n",
      "[proc 1][Train](83/100000) average pos_loss: 1.4778378009796143\n",
      "[proc 1][Train](83/100000) average neg_loss: 0.373771071434021\n",
      "[proc 1][Train](83/100000) average loss: 0.9258044362068176\n",
      "[proc 1][Train](83/100000) average regularization: 3.0176774998835754e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.163\n",
      "[proc 0][Train](83/100000) average pos_loss: 1.4720925092697144\n",
      "[proc 0][Train](83/100000) average neg_loss: 0.3724919557571411\n",
      "[proc 0][Train](83/100000) average loss: 0.9222922325134277\n",
      "[proc 0][Train](83/100000) average regularization: 3.128444632238825e-06\n",
      "[proc 0][Train] 1 steps take 2.612 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.440, backward: 0.070, update: 2.100\n",
      "[proc 1][Train](84/100000) average pos_loss: 1.4007757902145386\n",
      "[proc 1][Train](84/100000) average neg_loss: 0.5695743560791016\n",
      "[proc 1][Train](84/100000) average loss: 0.9851750731468201\n",
      "[proc 1][Train](84/100000) average regularization: 3.16899149765959e-06\n",
      "[proc 1][Train] 1 steps take 2.634 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.422, backward: 0.070, update: 2.139\n",
      "[proc 0][Train](84/100000) average pos_loss: 1.366720199584961\n",
      "[proc 0][Train](84/100000) average neg_loss: 0.5115074515342712\n",
      "[proc 0][Train](84/100000) average loss: 0.9391138553619385\n",
      "[proc 0][Train](84/100000) average regularization: 3.208157295375713e-06\n",
      "[proc 0][Train] 1 steps take 2.690 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.188\n",
      "[proc 1][Train](85/100000) average pos_loss: 1.4402945041656494\n",
      "[proc 1][Train](85/100000) average neg_loss: 0.38319581747055054\n",
      "[proc 1][Train](85/100000) average loss: 0.9117451906204224\n",
      "[proc 1][Train](85/100000) average regularization: 3.174123321514344e-06\n",
      "[proc 1][Train] 1 steps take 2.597 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.417, backward: 0.069, update: 2.109\n",
      "[proc 0][Train](85/100000) average pos_loss: 1.4875162839889526\n",
      "[proc 0][Train](85/100000) average neg_loss: 0.3555396795272827\n",
      "[proc 0][Train](85/100000) average loss: 0.9215279817581177\n",
      "[proc 0][Train](85/100000) average regularization: 3.2119639854499837e-06\n",
      "[proc 0][Train] 1 steps take 2.705 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.527, backward: 0.069, update: 2.107\n",
      "[proc 1][Train](86/100000) average pos_loss: 1.3830764293670654\n",
      "[proc 1][Train](86/100000) average neg_loss: 0.5249717235565186\n",
      "[proc 1][Train](86/100000) average loss: 0.954024076461792\n",
      "[proc 1][Train](86/100000) average regularization: 3.335869223519694e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.428, backward: 0.069, update: 2.171\n",
      "[proc 0][Train](86/100000) average pos_loss: 1.3716046810150146\n",
      "[proc 0][Train](86/100000) average neg_loss: 0.5060842037200928\n",
      "[proc 0][Train](86/100000) average loss: 0.9388444423675537\n",
      "[proc 0][Train](86/100000) average regularization: 3.288152356617502e-06\n",
      "[proc 0][Train] 1 steps take 2.665 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.156\n",
      "[proc 1][Train](87/100000) average pos_loss: 1.4453625679016113\n",
      "[proc 1][Train](87/100000) average neg_loss: 0.36861711740493774\n",
      "[proc 1][Train](87/100000) average loss: 0.9069898128509521\n",
      "[proc 1][Train](87/100000) average regularization: 3.1796341772860615e-06\n",
      "[proc 1][Train] 1 steps take 2.669 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.424, backward: 0.070, update: 2.173\n",
      "[proc 0][Train](87/100000) average pos_loss: 1.3750239610671997\n",
      "[proc 0][Train](87/100000) average neg_loss: 0.3437618017196655\n",
      "[proc 0][Train](87/100000) average loss: 0.8593928813934326\n",
      "[proc 0][Train](87/100000) average regularization: 3.192387339367997e-06\n",
      "[proc 0][Train] 1 steps take 2.668 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.437, backward: 0.071, update: 2.159\n",
      "[proc 1][Train](88/100000) average pos_loss: 1.2905209064483643\n",
      "[proc 1][Train](88/100000) average neg_loss: 0.5403578877449036\n",
      "[proc 1][Train](88/100000) average loss: 0.9154393672943115\n",
      "[proc 1][Train](88/100000) average regularization: 3.219339305360336e-06\n",
      "[proc 1][Train] 1 steps take 2.628 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.131\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](88/100000) average pos_loss: 1.443807601928711\n",
      "[proc 0][Train](88/100000) average neg_loss: 0.5417848825454712\n",
      "[proc 0][Train](88/100000) average loss: 0.9927962422370911\n",
      "[proc 0][Train](88/100000) average regularization: 3.228076138839242e-06\n",
      "[proc 0][Train] 1 steps take 2.578 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.071\n",
      "[proc 1][Train](89/100000) average pos_loss: 1.3767192363739014\n",
      "[proc 1][Train](89/100000) average neg_loss: 0.3275604844093323\n",
      "[proc 1][Train](89/100000) average loss: 0.8521398305892944\n",
      "[proc 1][Train](89/100000) average regularization: 3.282636271251249e-06\n",
      "[proc 1][Train] 1 steps take 2.774 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.276\n",
      "[proc 0][Train](89/100000) average pos_loss: 1.3940485715866089\n",
      "[proc 0][Train](89/100000) average neg_loss: 0.3984299302101135\n",
      "[proc 0][Train](89/100000) average loss: 0.8962392807006836\n",
      "[proc 0][Train](89/100000) average regularization: 3.287250592620694e-06\n",
      "[proc 0][Train] 1 steps take 2.682 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.183\n",
      "[proc 1][Train](90/100000) average pos_loss: 1.3870863914489746\n",
      "[proc 1][Train](90/100000) average neg_loss: 0.5084311366081238\n",
      "[proc 1][Train](90/100000) average loss: 0.9477587938308716\n",
      "[proc 1][Train](90/100000) average regularization: 3.43908845934493e-06\n",
      "[proc 1][Train] 1 steps take 2.664 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.166\n",
      "[proc 0][Train](90/100000) average pos_loss: 1.335855484008789\n",
      "[proc 0][Train](90/100000) average neg_loss: 0.5115202069282532\n",
      "[proc 0][Train](90/100000) average loss: 0.9236878156661987\n",
      "[proc 0][Train](90/100000) average regularization: 3.5856037357007153e-06\n",
      "[proc 0][Train] 1 steps take 2.636 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.429, backward: 0.070, update: 2.135\n",
      "[proc 1][Train](91/100000) average pos_loss: 1.2784011363983154\n",
      "[proc 1][Train](91/100000) average neg_loss: 0.4073835015296936\n",
      "[proc 1][Train](91/100000) average loss: 0.8428922891616821\n",
      "[proc 1][Train](91/100000) average regularization: 3.289067535661161e-06\n",
      "[proc 1][Train] 1 steps take 2.581 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.083\n",
      "[proc 0][Train](91/100000) average pos_loss: 1.3107482194900513\n",
      "[proc 0][Train](91/100000) average neg_loss: 0.34162187576293945\n",
      "[proc 0][Train](91/100000) average loss: 0.8261850476264954\n",
      "[proc 0][Train](91/100000) average regularization: 3.394747636775719e-06\n",
      "[proc 0][Train] 1 steps take 2.624 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.409, backward: 0.070, update: 2.143\n",
      "[proc 1][Train](92/100000) average pos_loss: 1.366861343383789\n",
      "[proc 1][Train](92/100000) average neg_loss: 0.5413082838058472\n",
      "[proc 1][Train](92/100000) average loss: 0.9540848135948181\n",
      "[proc 1][Train](92/100000) average regularization: 3.482944975985447e-06\n",
      "[proc 1][Train] 1 steps take 2.628 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.130\n",
      "[proc 0][Train](92/100000) average pos_loss: 1.349940538406372\n",
      "[proc 0][Train](92/100000) average neg_loss: 0.49456581473350525\n",
      "[proc 0][Train](92/100000) average loss: 0.9222531914710999\n",
      "[proc 0][Train](92/100000) average regularization: 3.3594640171941137e-06\n",
      "[proc 0][Train] 1 steps take 2.635 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.132\n",
      "[proc 1][Train](93/100000) average pos_loss: 1.2885091304779053\n",
      "[proc 1][Train](93/100000) average neg_loss: 0.34501707553863525\n",
      "[proc 1][Train](93/100000) average loss: 0.8167631030082703\n",
      "[proc 1][Train](93/100000) average regularization: 3.4286533718841383e-06\n",
      "[proc 1][Train] 1 steps take 2.632 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.413, backward: 0.069, update: 2.148\n",
      "[proc 0][Train](93/100000) average pos_loss: 1.266711950302124\n",
      "[proc 0][Train](93/100000) average neg_loss: 0.36504924297332764\n",
      "[proc 0][Train](93/100000) average loss: 0.8158805966377258\n",
      "[proc 0][Train](93/100000) average regularization: 3.328376351419138e-06\n",
      "[proc 0][Train] 1 steps take 2.718 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.215\n",
      "[proc 1][Train](94/100000) average pos_loss: 1.3781020641326904\n",
      "[proc 1][Train](94/100000) average neg_loss: 0.5687047243118286\n",
      "[proc 1][Train](94/100000) average loss: 0.9734033942222595\n",
      "[proc 1][Train](94/100000) average regularization: 3.410832960071275e-06\n",
      "[proc 1][Train] 1 steps take 2.616 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.071, update: 2.113\n",
      "[proc 0][Train](94/100000) average pos_loss: 1.3591527938842773\n",
      "[proc 0][Train](94/100000) average neg_loss: 0.5017149448394775\n",
      "[proc 0][Train](94/100000) average loss: 0.9304338693618774\n",
      "[proc 0][Train](94/100000) average regularization: 3.529298737703357e-06\n",
      "[proc 0][Train] 1 steps take 2.660 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.444, backward: 0.070, update: 2.145\n",
      "[proc 1][Train](95/100000) average pos_loss: 1.3790771961212158\n",
      "[proc 1][Train](95/100000) average neg_loss: 0.37941890954971313\n",
      "[proc 1][Train](95/100000) average loss: 0.8792480230331421\n",
      "[proc 1][Train](95/100000) average regularization: 3.5162418043910293e-06\n",
      "[proc 1][Train] 1 steps take 2.628 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.404, backward: 0.069, update: 2.153\n",
      "[proc 0][Train](95/100000) average pos_loss: 1.289764165878296\n",
      "[proc 0][Train](95/100000) average neg_loss: 0.36179447174072266\n",
      "[proc 0][Train](95/100000) average loss: 0.8257793188095093\n",
      "[proc 0][Train](95/100000) average regularization: 3.4297825095563894e-06\n",
      "[proc 0][Train] 1 steps take 2.606 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.440, backward: 0.069, update: 2.094\n",
      "[proc 1][Train](96/100000) average pos_loss: 1.2543189525604248\n",
      "[proc 1][Train](96/100000) average neg_loss: 0.5510137677192688\n",
      "[proc 1][Train](96/100000) average loss: 0.9026663303375244\n",
      "[proc 1][Train](96/100000) average regularization: 3.605514166338253e-06\n",
      "[proc 1][Train] 1 steps take 2.642 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.071, update: 2.141\n",
      "[proc 0][Train](96/100000) average pos_loss: 1.39673912525177\n",
      "[proc 0][Train](96/100000) average neg_loss: 0.5036065578460693\n",
      "[proc 0][Train](96/100000) average loss: 0.9501728415489197\n",
      "[proc 0][Train](96/100000) average regularization: 3.4114138998120325e-06\n",
      "[proc 0][Train] 1 steps take 2.683 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.408, backward: 0.069, update: 2.205\n",
      "[proc 1][Train](97/100000) average pos_loss: 1.3084378242492676\n",
      "[proc 1][Train](97/100000) average neg_loss: 0.3698273301124573\n",
      "[proc 1][Train](97/100000) average loss: 0.83913254737854\n",
      "[proc 1][Train](97/100000) average regularization: 3.539731551427394e-06\n",
      "[proc 1][Train] 1 steps take 2.653 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.431, backward: 0.070, update: 2.136\n",
      "[proc 0][Train](97/100000) average pos_loss: 1.3079478740692139\n",
      "[proc 0][Train](97/100000) average neg_loss: 0.39299869537353516\n",
      "[proc 0][Train](97/100000) average loss: 0.8504732847213745\n",
      "[proc 0][Train](97/100000) average regularization: 3.5846226182911778e-06\n",
      "[proc 0][Train] 1 steps take 2.737 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.426, backward: 0.070, update: 2.224\n",
      "[proc 1][Train](98/100000) average pos_loss: 1.3441238403320312\n",
      "[proc 1][Train](98/100000) average neg_loss: 0.5474393367767334\n",
      "[proc 1][Train](98/100000) average loss: 0.9457815885543823\n",
      "[proc 1][Train](98/100000) average regularization: 3.7166000765864737e-06\n",
      "[proc 1][Train] 1 steps take 2.760 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.440, backward: 0.069, update: 2.234\n",
      "[proc 0][Train](98/100000) average pos_loss: 1.3241560459136963\n",
      "[proc 0][Train](98/100000) average neg_loss: 0.5144438743591309\n",
      "[proc 0][Train](98/100000) average loss: 0.9192999601364136\n",
      "[proc 0][Train](98/100000) average regularization: 3.5327834666532e-06\n",
      "[proc 0][Train] 1 steps take 2.678 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.428, backward: 0.070, update: 2.165\n",
      "[proc 1][Train](99/100000) average pos_loss: 1.2633600234985352\n",
      "[proc 1][Train](99/100000) average neg_loss: 0.3672062158584595\n",
      "[proc 1][Train](99/100000) average loss: 0.8152831196784973\n",
      "[proc 1][Train](99/100000) average regularization: 3.4817044252122287e-06\n",
      "[proc 1][Train] 1 steps take 2.687 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.176\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](99/100000) average pos_loss: 1.2455143928527832\n",
      "[proc 0][Train](99/100000) average neg_loss: 0.36318397521972656\n",
      "[proc 0][Train](99/100000) average loss: 0.8043491840362549\n",
      "[proc 0][Train](99/100000) average regularization: 3.6083683880860917e-06\n",
      "[proc 0][Train] 1 steps take 2.670 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.404, backward: 0.069, update: 2.195\n",
      "[proc 1][Train](100/100000) average pos_loss: 1.2287588119506836\n",
      "[proc 1][Train](100/100000) average neg_loss: 0.5226110219955444\n",
      "[proc 1][Train](100/100000) average loss: 0.875684916973114\n",
      "[proc 1][Train](100/100000) average regularization: 3.691867959787487e-06\n",
      "[proc 1][Train] 1 steps take 2.655 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.071, update: 2.158\n",
      "[proc 0][Train](100/100000) average pos_loss: 1.2744104862213135\n",
      "[proc 0][Train](100/100000) average neg_loss: 0.5199477076530457\n",
      "[proc 0][Train](100/100000) average loss: 0.897179126739502\n",
      "[proc 0][Train](100/100000) average regularization: 3.574524043870042e-06\n",
      "[proc 0][Train] 1 steps take 2.700 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.410, backward: 0.071, update: 2.218\n",
      "[proc 1][Train](101/100000) average pos_loss: 1.2851375341415405\n",
      "[proc 1][Train](101/100000) average neg_loss: 0.362371563911438\n",
      "[proc 1][Train](101/100000) average loss: 0.8237545490264893\n",
      "[proc 1][Train](101/100000) average regularization: 3.610885187299573e-06\n",
      "[proc 1][Train] 1 steps take 2.669 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.437, backward: 0.071, update: 2.159\n",
      "[proc 0][Train](101/100000) average pos_loss: 1.3068649768829346\n",
      "[proc 0][Train](101/100000) average neg_loss: 0.37772923707962036\n",
      "[proc 0][Train](101/100000) average loss: 0.8422970771789551\n",
      "[proc 0][Train](101/100000) average regularization: 3.7293343666533474e-06\n",
      "[proc 0][Train] 1 steps take 2.621 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.414, backward: 0.070, update: 2.136\n",
      "[proc 1][Train](102/100000) average pos_loss: 1.2658230066299438\n",
      "[proc 1][Train](102/100000) average neg_loss: 0.5052257776260376\n",
      "[proc 1][Train](102/100000) average loss: 0.8855243921279907\n",
      "[proc 1][Train](102/100000) average regularization: 3.5762268453254364e-06\n",
      "[proc 1][Train] 1 steps take 2.652 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.145\n",
      "[proc 0][Train](102/100000) average pos_loss: 1.2153221368789673\n",
      "[proc 0][Train](102/100000) average neg_loss: 0.5364200472831726\n",
      "[proc 0][Train](102/100000) average loss: 0.8758710622787476\n",
      "[proc 0][Train](102/100000) average regularization: 3.6259636999602662e-06\n",
      "[proc 0][Train] 1 steps take 2.623 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.412, backward: 0.071, update: 2.140\n",
      "[proc 1][Train](103/100000) average pos_loss: 1.231119155883789\n",
      "[proc 1][Train](103/100000) average neg_loss: 0.3597433269023895\n",
      "[proc 1][Train](103/100000) average loss: 0.7954312562942505\n",
      "[proc 1][Train](103/100000) average regularization: 3.7569234336842783e-06\n",
      "[proc 1][Train] 1 steps take 2.612 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.106\n",
      "[proc 0][Train](103/100000) average pos_loss: 1.223183035850525\n",
      "[proc 0][Train](103/100000) average neg_loss: 0.34047937393188477\n",
      "[proc 0][Train](103/100000) average loss: 0.7818312048912048\n",
      "[proc 0][Train](103/100000) average regularization: 3.7606453133776085e-06\n",
      "[proc 0][Train] 1 steps take 2.674 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.441, backward: 0.070, update: 2.161\n",
      "[proc 1][Train](104/100000) average pos_loss: 1.2390074729919434\n",
      "[proc 1][Train](104/100000) average neg_loss: 0.5387769341468811\n",
      "[proc 1][Train](104/100000) average loss: 0.8888921737670898\n",
      "[proc 1][Train](104/100000) average regularization: 3.843670128844678e-06\n",
      "[proc 1][Train] 1 steps take 2.661 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.422, backward: 0.069, update: 2.168\n",
      "[proc 0][Train](104/100000) average pos_loss: 1.232169270515442\n",
      "[proc 0][Train](104/100000) average neg_loss: 0.5531923770904541\n",
      "[proc 0][Train](104/100000) average loss: 0.892680823802948\n",
      "[proc 0][Train](104/100000) average regularization: 3.779981625484652e-06\n",
      "[proc 0][Train] 1 steps take 2.607 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.100\n",
      "[proc 1][Train](105/100000) average pos_loss: 1.2523343563079834\n",
      "[proc 1][Train](105/100000) average neg_loss: 0.3446754813194275\n",
      "[proc 1][Train](105/100000) average loss: 0.7985049486160278\n",
      "[proc 1][Train](105/100000) average regularization: 3.7709553453169065e-06\n",
      "[proc 1][Train] 1 steps take 2.654 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.415, backward: 0.071, update: 2.167\n",
      "[proc 0][Train](105/100000) average pos_loss: 1.1739643812179565\n",
      "[proc 0][Train](105/100000) average neg_loss: 0.3330155313014984\n",
      "[proc 0][Train](105/100000) average loss: 0.7534899711608887\n",
      "[proc 0][Train](105/100000) average regularization: 3.688582637551008e-06\n",
      "[proc 0][Train] 1 steps take 2.661 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.160\n",
      "[proc 1][Train](106/100000) average pos_loss: 1.1633403301239014\n",
      "[proc 1][Train](106/100000) average neg_loss: 0.5276277661323547\n",
      "[proc 1][Train](106/100000) average loss: 0.8454840183258057\n",
      "[proc 1][Train](106/100000) average regularization: 3.687335720314877e-06\n",
      "[proc 1][Train] 1 steps take 2.630 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.131\n",
      "[proc 0][Train](106/100000) average pos_loss: 1.2572097778320312\n",
      "[proc 0][Train](106/100000) average neg_loss: 0.4903738498687744\n",
      "[proc 0][Train](106/100000) average loss: 0.8737918138504028\n",
      "[proc 0][Train](106/100000) average regularization: 3.653537760328618e-06\n",
      "[proc 0][Train] 1 steps take 2.635 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.131\n",
      "[proc 1][Train](107/100000) average pos_loss: 1.1921188831329346\n",
      "[proc 1][Train](107/100000) average neg_loss: 0.3775377869606018\n",
      "[proc 1][Train](107/100000) average loss: 0.7848283052444458\n",
      "[proc 1][Train](107/100000) average regularization: 3.847329935524613e-06\n",
      "[proc 1][Train] 1 steps take 2.659 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.440, backward: 0.070, update: 2.147\n",
      "[proc 0][Train](107/100000) average pos_loss: 1.2330906391143799\n",
      "[proc 0][Train](107/100000) average neg_loss: 0.3609016239643097\n",
      "[proc 0][Train](107/100000) average loss: 0.7969961166381836\n",
      "[proc 0][Train](107/100000) average regularization: 3.751797521545086e-06\n",
      "[proc 0][Train] 1 steps take 2.581 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.402, backward: 0.070, update: 2.108\n",
      "[proc 1][Train](108/100000) average pos_loss: 1.192420244216919\n",
      "[proc 1][Train](108/100000) average neg_loss: 0.5534849762916565\n",
      "[proc 1][Train](108/100000) average loss: 0.8729525804519653\n",
      "[proc 1][Train](108/100000) average regularization: 3.944197942473693e-06\n",
      "[proc 1][Train] 1 steps take 2.646 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.069, update: 2.145\n",
      "[proc 0][Train](108/100000) average pos_loss: 1.1725459098815918\n",
      "[proc 0][Train](108/100000) average neg_loss: 0.5231863260269165\n",
      "[proc 0][Train](108/100000) average loss: 0.8478661179542542\n",
      "[proc 0][Train](108/100000) average regularization: 3.7072434224683093e-06\n",
      "[proc 0][Train] 1 steps take 2.675 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.433, backward: 0.070, update: 2.170\n",
      "[proc 1][Train](109/100000) average pos_loss: 1.2440316677093506\n",
      "[proc 1][Train](109/100000) average neg_loss: 0.3467693030834198\n",
      "[proc 1][Train](109/100000) average loss: 0.7954005002975464\n",
      "[proc 1][Train](109/100000) average regularization: 3.7099762266734615e-06\n",
      "[proc 1][Train] 1 steps take 2.619 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.421, backward: 0.069, update: 2.127\n",
      "[proc 0][Train](109/100000) average pos_loss: 1.2198938131332397\n",
      "[proc 0][Train](109/100000) average neg_loss: 0.35161373019218445\n",
      "[proc 0][Train](109/100000) average loss: 0.7857537865638733\n",
      "[proc 0][Train](109/100000) average regularization: 3.7745739973615855e-06\n",
      "[proc 0][Train] 1 steps take 2.637 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.135\n",
      "[proc 1][Train](110/100000) average pos_loss: 1.204467535018921\n",
      "[proc 1][Train](110/100000) average neg_loss: 0.5478365421295166\n",
      "[proc 1][Train](110/100000) average loss: 0.8761520385742188\n",
      "[proc 1][Train](110/100000) average regularization: 3.892604127031518e-06\n",
      "[proc 1][Train] 1 steps take 2.640 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.071, update: 2.137\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](110/100000) average pos_loss: 1.1528018712997437\n",
      "[proc 0][Train](110/100000) average neg_loss: 0.5700418949127197\n",
      "[proc 0][Train](110/100000) average loss: 0.8614218831062317\n",
      "[proc 0][Train](110/100000) average regularization: 3.903267497662455e-06\n",
      "[proc 0][Train] 1 steps take 2.727 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.492, backward: 0.070, update: 2.164\n",
      "[proc 1][Train](111/100000) average pos_loss: 1.173447608947754\n",
      "[proc 1][Train](111/100000) average neg_loss: 0.3729130029678345\n",
      "[proc 1][Train](111/100000) average loss: 0.7731803059577942\n",
      "[proc 1][Train](111/100000) average regularization: 3.871577064273879e-06\n",
      "[proc 1][Train] 1 steps take 2.730 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.405, backward: 0.069, update: 2.254\n",
      "[proc 0][Train](111/100000) average pos_loss: 1.1785063743591309\n",
      "[proc 0][Train](111/100000) average neg_loss: 0.3667331039905548\n",
      "[proc 0][Train](111/100000) average loss: 0.7726197242736816\n",
      "[proc 0][Train](111/100000) average regularization: 3.7732957025582436e-06\n",
      "[proc 0][Train] 1 steps take 2.679 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.071, update: 2.175\n",
      "[proc 1][Train](112/100000) average pos_loss: 1.1922709941864014\n",
      "[proc 1][Train](112/100000) average neg_loss: 0.5369095206260681\n",
      "[proc 1][Train](112/100000) average loss: 0.8645902872085571\n",
      "[proc 1][Train](112/100000) average regularization: 3.8574839891225565e-06\n",
      "[proc 1][Train] 1 steps take 2.671 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.168\n",
      "[proc 0][Train](112/100000) average pos_loss: 1.1852004528045654\n",
      "[proc 0][Train](112/100000) average neg_loss: 0.545214056968689\n",
      "[proc 0][Train](112/100000) average loss: 0.8652072548866272\n",
      "[proc 0][Train](112/100000) average regularization: 3.861124241666403e-06\n",
      "[proc 0][Train] 1 steps take 2.776 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.070, update: 2.266\n",
      "[proc 1][Train](113/100000) average pos_loss: 1.1906332969665527\n",
      "[proc 1][Train](113/100000) average neg_loss: 0.38469168543815613\n",
      "[proc 1][Train](113/100000) average loss: 0.7876625061035156\n",
      "[proc 1][Train](113/100000) average regularization: 3.9887545426608995e-06\n",
      "[proc 1][Train] 1 steps take 2.676 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.420, backward: 0.070, update: 2.171\n",
      "[proc 0][Train](113/100000) average pos_loss: 1.1437416076660156\n",
      "[proc 0][Train](113/100000) average neg_loss: 0.3334537744522095\n",
      "[proc 0][Train](113/100000) average loss: 0.7385976910591125\n",
      "[proc 0][Train](113/100000) average regularization: 4.076776349393185e-06\n",
      "[proc 0][Train] 1 steps take 2.742 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.423, backward: 0.070, update: 2.230\n",
      "[proc 1][Train](114/100000) average pos_loss: 1.1271158456802368\n",
      "[proc 1][Train](114/100000) average neg_loss: 0.5510358810424805\n",
      "[proc 1][Train](114/100000) average loss: 0.8390758633613586\n",
      "[proc 1][Train](114/100000) average regularization: 4.0286554394697305e-06\n",
      "[proc 1][Train] 1 steps take 2.727 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.433, backward: 0.069, update: 2.207\n",
      "[proc 0][Train](114/100000) average pos_loss: 1.2195464372634888\n",
      "[proc 0][Train](114/100000) average neg_loss: 0.576662540435791\n",
      "[proc 0][Train](114/100000) average loss: 0.8981044888496399\n",
      "[proc 0][Train](114/100000) average regularization: 4.013331817986909e-06\n",
      "[proc 0][Train] 1 steps take 2.675 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.431, backward: 0.069, update: 2.157\n",
      "[proc 1][Train](115/100000) average pos_loss: 1.2072687149047852\n",
      "[proc 1][Train](115/100000) average neg_loss: 0.35137131810188293\n",
      "[proc 1][Train](115/100000) average loss: 0.7793200016021729\n",
      "[proc 1][Train](115/100000) average regularization: 4.0782733776723035e-06\n",
      "[proc 1][Train] 1 steps take 2.738 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.395, backward: 0.070, update: 2.271\n",
      "[proc 0][Train](115/100000) average pos_loss: 1.1103501319885254\n",
      "[proc 0][Train](115/100000) average neg_loss: 0.3339252471923828\n",
      "[proc 0][Train](115/100000) average loss: 0.7221376895904541\n",
      "[proc 0][Train](115/100000) average regularization: 4.0332815842702985e-06\n",
      "[proc 0][Train] 1 steps take 2.674 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.173\n",
      "[proc 1][Train](116/100000) average pos_loss: 1.0899395942687988\n",
      "[proc 1][Train](116/100000) average neg_loss: 0.5438458323478699\n",
      "[proc 1][Train](116/100000) average loss: 0.8168927431106567\n",
      "[proc 1][Train](116/100000) average regularization: 3.969134013459552e-06\n",
      "[proc 1][Train] 1 steps take 2.672 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.170\n",
      "[proc 0][Train](116/100000) average pos_loss: 1.1342240571975708\n",
      "[proc 0][Train](116/100000) average neg_loss: 0.5827693939208984\n",
      "[proc 0][Train](116/100000) average loss: 0.8584967255592346\n",
      "[proc 0][Train](116/100000) average regularization: 3.9410147110174876e-06\n",
      "[proc 0][Train] 1 steps take 2.662 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.069, update: 2.163\n",
      "[proc 1][Train](117/100000) average pos_loss: 1.0688838958740234\n",
      "[proc 1][Train](117/100000) average neg_loss: 0.34877780079841614\n",
      "[proc 1][Train](117/100000) average loss: 0.7088308334350586\n",
      "[proc 1][Train](117/100000) average regularization: 4.159890977462055e-06\n",
      "[proc 1][Train] 1 steps take 2.653 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.069, update: 2.152\n",
      "[proc 0][Train](117/100000) average pos_loss: 1.1806594133377075\n",
      "[proc 0][Train](117/100000) average neg_loss: 0.3008992075920105\n",
      "[proc 0][Train](117/100000) average loss: 0.7407792806625366\n",
      "[proc 0][Train](117/100000) average regularization: 4.107134827791015e-06\n",
      "[proc 0][Train] 1 steps take 2.657 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.160\n",
      "[proc 1][Train](118/100000) average pos_loss: 1.1033141613006592\n",
      "[proc 1][Train](118/100000) average neg_loss: 0.5791289806365967\n",
      "[proc 1][Train](118/100000) average loss: 0.8412215709686279\n",
      "[proc 1][Train](118/100000) average regularization: 4.019075277028605e-06\n",
      "[proc 1][Train] 1 steps take 2.684 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.441, backward: 0.070, update: 2.171\n",
      "[proc 0][Train](118/100000) average pos_loss: 1.133440375328064\n",
      "[proc 0][Train](118/100000) average neg_loss: 0.5521506071090698\n",
      "[proc 0][Train](118/100000) average loss: 0.8427954912185669\n",
      "[proc 0][Train](118/100000) average regularization: 4.074987828062149e-06\n",
      "[proc 0][Train] 1 steps take 2.622 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.122\n",
      "[proc 1][Train](119/100000) average pos_loss: 1.1061923503875732\n",
      "[proc 1][Train](119/100000) average neg_loss: 0.3482547104358673\n",
      "[proc 1][Train](119/100000) average loss: 0.7272235155105591\n",
      "[proc 1][Train](119/100000) average regularization: 4.211246960039716e-06\n",
      "[proc 1][Train] 1 steps take 2.798 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.430, backward: 0.069, update: 2.297\n",
      "[proc 0][Train](119/100000) average pos_loss: 1.135471224784851\n",
      "[proc 0][Train](119/100000) average neg_loss: 0.3602507710456848\n",
      "[proc 0][Train](119/100000) average loss: 0.7478610277175903\n",
      "[proc 0][Train](119/100000) average regularization: 4.143165824643802e-06\n",
      "[proc 0][Train] 1 steps take 2.659 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.433, backward: 0.070, update: 2.154\n",
      "[proc 1][Train](120/100000) average pos_loss: 1.170667290687561\n",
      "[proc 1][Train](120/100000) average neg_loss: 0.5259691476821899\n",
      "[proc 1][Train](120/100000) average loss: 0.8483182191848755\n",
      "[proc 1][Train](120/100000) average regularization: 4.15345039073145e-06\n",
      "[proc 1][Train] 1 steps take 2.618 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.431, backward: 0.069, update: 2.116\n",
      "[proc 0][Train](120/100000) average pos_loss: 1.1107879877090454\n",
      "[proc 0][Train](120/100000) average neg_loss: 0.5505994558334351\n",
      "[proc 0][Train](120/100000) average loss: 0.8306937217712402\n",
      "[proc 0][Train](120/100000) average regularization: 4.133824859309243e-06\n",
      "[proc 0][Train] 1 steps take 2.621 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.444, backward: 0.070, update: 2.106\n",
      "[proc 1][Train](121/100000) average pos_loss: 1.1936551332473755\n",
      "[proc 1][Train](121/100000) average neg_loss: 0.3376688063144684\n",
      "[proc 1][Train](121/100000) average loss: 0.7656619548797607\n",
      "[proc 1][Train](121/100000) average regularization: 4.128382443013834e-06\n",
      "[proc 1][Train] 1 steps take 2.664 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.168\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](121/100000) average pos_loss: 1.1143460273742676\n",
      "[proc 0][Train](121/100000) average neg_loss: 0.3600374460220337\n",
      "[proc 0][Train](121/100000) average loss: 0.7371917366981506\n",
      "[proc 0][Train](121/100000) average regularization: 4.106740561837796e-06\n",
      "[proc 0][Train] 1 steps take 2.645 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.443, backward: 0.069, update: 2.132\n",
      "[proc 1][Train](122/100000) average pos_loss: 1.135267734527588\n",
      "[proc 1][Train](122/100000) average neg_loss: 0.5426846742630005\n",
      "[proc 1][Train](122/100000) average loss: 0.8389762043952942\n",
      "[proc 1][Train](122/100000) average regularization: 4.016516868432518e-06\n",
      "[proc 1][Train] 1 steps take 2.632 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.424, backward: 0.070, update: 2.137\n",
      "[proc 0][Train](122/100000) average pos_loss: 1.1176319122314453\n",
      "[proc 0][Train](122/100000) average neg_loss: 0.5417845845222473\n",
      "[proc 0][Train](122/100000) average loss: 0.8297082185745239\n",
      "[proc 0][Train](122/100000) average regularization: 4.010005341115175e-06\n",
      "[proc 0][Train] 1 steps take 2.630 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.123\n",
      "[proc 1][Train](123/100000) average pos_loss: 1.1060272455215454\n",
      "[proc 1][Train](123/100000) average neg_loss: 0.35806804895401\n",
      "[proc 1][Train](123/100000) average loss: 0.7320476770401001\n",
      "[proc 1][Train](123/100000) average regularization: 4.167090082773939e-06\n",
      "[proc 1][Train] 1 steps take 2.617 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.420, backward: 0.070, update: 2.126\n",
      "[proc 0][Train](123/100000) average pos_loss: 1.0456924438476562\n",
      "[proc 0][Train](123/100000) average neg_loss: 0.3173714280128479\n",
      "[proc 0][Train](123/100000) average loss: 0.6815319061279297\n",
      "[proc 0][Train](123/100000) average regularization: 4.1823072933766525e-06\n",
      "[proc 0][Train] 1 steps take 2.630 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.124\n",
      "[proc 1][Train](124/100000) average pos_loss: 1.0880918502807617\n",
      "[proc 1][Train](124/100000) average neg_loss: 0.5754326581954956\n",
      "[proc 1][Train](124/100000) average loss: 0.8317622542381287\n",
      "[proc 1][Train](124/100000) average regularization: 4.199076101940591e-06\n",
      "[proc 1][Train] 1 steps take 2.591 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.415, backward: 0.071, update: 2.104\n",
      "[proc 0][Train](124/100000) average pos_loss: 1.1543904542922974\n",
      "[proc 0][Train](124/100000) average neg_loss: 0.5391985177993774\n",
      "[proc 0][Train](124/100000) average loss: 0.8467944860458374\n",
      "[proc 0][Train](124/100000) average regularization: 4.173335128143663e-06\n",
      "[proc 0][Train] 1 steps take 2.615 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.411, backward: 0.070, update: 2.133\n",
      "[proc 1][Train](125/100000) average pos_loss: 1.0926814079284668\n",
      "[proc 1][Train](125/100000) average neg_loss: 0.3410637378692627\n",
      "[proc 1][Train](125/100000) average loss: 0.7168725728988647\n",
      "[proc 1][Train](125/100000) average regularization: 4.244250703777652e-06\n",
      "[proc 1][Train] 1 steps take 2.553 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.058\n",
      "[proc 0][Train](125/100000) average pos_loss: 1.068997859954834\n",
      "[proc 0][Train](125/100000) average neg_loss: 0.32647231221199036\n",
      "[proc 0][Train](125/100000) average loss: 0.697735071182251\n",
      "[proc 0][Train](125/100000) average regularization: 4.208469817967853e-06\n",
      "[proc 0][Train] 1 steps take 2.590 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.080\n",
      "[proc 1][Train](126/100000) average pos_loss: 1.074122428894043\n",
      "[proc 1][Train](126/100000) average neg_loss: 0.5541123151779175\n",
      "[proc 1][Train](126/100000) average loss: 0.8141173720359802\n",
      "[proc 1][Train](126/100000) average regularization: 4.1378002606506925e-06\n",
      "[proc 1][Train] 1 steps take 2.638 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.415, backward: 0.070, update: 2.152\n",
      "[proc 0][Train](126/100000) average pos_loss: 1.0656627416610718\n",
      "[proc 0][Train](126/100000) average neg_loss: 0.5265318751335144\n",
      "[proc 0][Train](126/100000) average loss: 0.7960972785949707\n",
      "[proc 0][Train](126/100000) average regularization: 4.2254168874933384e-06\n",
      "[proc 0][Train] 1 steps take 2.639 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.129\n",
      "[proc 1][Train](127/100000) average pos_loss: 1.067790150642395\n",
      "[proc 1][Train](127/100000) average neg_loss: 0.37828561663627625\n",
      "[proc 1][Train](127/100000) average loss: 0.7230378985404968\n",
      "[proc 1][Train](127/100000) average regularization: 4.2965752982127015e-06\n",
      "[proc 1][Train] 1 steps take 2.589 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.092\n",
      "[proc 0][Train](127/100000) average pos_loss: 1.105796456336975\n",
      "[proc 0][Train](127/100000) average neg_loss: 0.3302328884601593\n",
      "[proc 0][Train](127/100000) average loss: 0.718014657497406\n",
      "[proc 0][Train](127/100000) average regularization: 4.2062151806021575e-06\n",
      "[proc 0][Train] 1 steps take 2.720 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.213\n",
      "[proc 1][Train](128/100000) average pos_loss: 1.0402700901031494\n",
      "[proc 1][Train](128/100000) average neg_loss: 0.5882683396339417\n",
      "[proc 1][Train](128/100000) average loss: 0.8142691850662231\n",
      "[proc 1][Train](128/100000) average regularization: 4.350165454525268e-06\n",
      "[proc 1][Train] 1 steps take 2.723 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.071, update: 2.224\n",
      "[proc 0][Train](128/100000) average pos_loss: 1.1209123134613037\n",
      "[proc 0][Train](128/100000) average neg_loss: 0.5352867841720581\n",
      "[proc 0][Train](128/100000) average loss: 0.8280995488166809\n",
      "[proc 0][Train](128/100000) average regularization: 4.167465704085771e-06\n",
      "[proc 0][Train] 1 steps take 2.679 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.175\n",
      "[proc 1][Train](129/100000) average pos_loss: 1.0806976556777954\n",
      "[proc 1][Train](129/100000) average neg_loss: 0.354941189289093\n",
      "[proc 1][Train](129/100000) average loss: 0.7178194522857666\n",
      "[proc 1][Train](129/100000) average regularization: 4.3693439693015534e-06\n",
      "[proc 1][Train] 1 steps take 2.635 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.423, backward: 0.069, update: 2.128\n",
      "[proc 0][Train](129/100000) average pos_loss: 1.0721919536590576\n",
      "[proc 0][Train](129/100000) average neg_loss: 0.3419564962387085\n",
      "[proc 0][Train](129/100000) average loss: 0.7070742249488831\n",
      "[proc 0][Train](129/100000) average regularization: 4.168987743469188e-06\n",
      "[proc 0][Train] 1 steps take 2.644 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.420, backward: 0.070, update: 2.138\n",
      "[proc 1][Train](130/100000) average pos_loss: 1.0405223369598389\n",
      "[proc 1][Train](130/100000) average neg_loss: 0.5645447969436646\n",
      "[proc 1][Train](130/100000) average loss: 0.8025335669517517\n",
      "[proc 1][Train](130/100000) average regularization: 4.181705207884079e-06\n",
      "[proc 1][Train] 1 steps take 2.668 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.403, backward: 0.070, update: 2.181\n",
      "[proc 0][Train](130/100000) average pos_loss: 1.01255202293396\n",
      "[proc 0][Train](130/100000) average neg_loss: 0.5573586225509644\n",
      "[proc 0][Train](130/100000) average loss: 0.7849553227424622\n",
      "[proc 0][Train](130/100000) average regularization: 4.328672730480321e-06\n",
      "[proc 0][Train] 1 steps take 2.790 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.424, backward: 0.070, update: 2.276\n",
      "[proc 1][Train](131/100000) average pos_loss: 1.101139783859253\n",
      "[proc 1][Train](131/100000) average neg_loss: 0.3059992790222168\n",
      "[proc 1][Train](131/100000) average loss: 0.7035695314407349\n",
      "[proc 1][Train](131/100000) average regularization: 4.4026960495102685e-06\n",
      "[proc 1][Train] 1 steps take 2.632 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.415, backward: 0.069, update: 2.147\n",
      "[proc 0][Train](131/100000) average pos_loss: 0.994500994682312\n",
      "[proc 0][Train](131/100000) average neg_loss: 0.3505246043205261\n",
      "[proc 0][Train](131/100000) average loss: 0.6725127696990967\n",
      "[proc 0][Train](131/100000) average regularization: 4.3646455196721945e-06\n",
      "[proc 0][Train] 1 steps take 2.721 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.423, backward: 0.070, update: 2.226\n",
      "[proc 1][Train](132/100000) average pos_loss: 1.0267928838729858\n",
      "[proc 1][Train](132/100000) average neg_loss: 0.5815162062644958\n",
      "[proc 1][Train](132/100000) average loss: 0.8041545152664185\n",
      "[proc 1][Train](132/100000) average regularization: 4.2928859329549596e-06\n",
      "[proc 1][Train] 1 steps take 2.735 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.445, backward: 0.070, update: 2.218\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](132/100000) average pos_loss: 0.9551223516464233\n",
      "[proc 0][Train](132/100000) average neg_loss: 0.5561748147010803\n",
      "[proc 0][Train](132/100000) average loss: 0.7556486129760742\n",
      "[proc 0][Train](132/100000) average regularization: 4.388291472423589e-06\n",
      "[proc 0][Train] 1 steps take 2.692 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.191\n",
      "[proc 1][Train](133/100000) average pos_loss: 1.0083160400390625\n",
      "[proc 1][Train](133/100000) average neg_loss: 0.3248260021209717\n",
      "[proc 1][Train](133/100000) average loss: 0.6665710210800171\n",
      "[proc 1][Train](133/100000) average regularization: 4.421291578182718e-06\n",
      "[proc 1][Train] 1 steps take 2.680 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.176\n",
      "[proc 0][Train](133/100000) average pos_loss: 1.0957579612731934\n",
      "[proc 0][Train](133/100000) average neg_loss: 0.34142327308654785\n",
      "[proc 0][Train](133/100000) average loss: 0.7185906171798706\n",
      "[proc 0][Train](133/100000) average regularization: 4.115780029678717e-06\n",
      "[proc 0][Train] 1 steps take 2.702 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.071, update: 2.199\n",
      "[proc 1][Train](134/100000) average pos_loss: 0.9519528746604919\n",
      "[proc 1][Train](134/100000) average neg_loss: 0.5388689637184143\n",
      "[proc 1][Train](134/100000) average loss: 0.7454109191894531\n",
      "[proc 1][Train](134/100000) average regularization: 4.349449682194972e-06\n",
      "[proc 1][Train] 1 steps take 2.726 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.443, backward: 0.070, update: 2.212\n",
      "[proc 0][Train](134/100000) average pos_loss: 1.0410977602005005\n",
      "[proc 0][Train](134/100000) average neg_loss: 0.5601415634155273\n",
      "[proc 0][Train](134/100000) average loss: 0.8006196618080139\n",
      "[proc 0][Train](134/100000) average regularization: 4.363786956673721e-06\n",
      "[proc 0][Train] 1 steps take 2.683 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.437, backward: 0.071, update: 2.174\n",
      "[proc 1][Train](135/100000) average pos_loss: 1.0108304023742676\n",
      "[proc 1][Train](135/100000) average neg_loss: 0.3494623899459839\n",
      "[proc 1][Train](135/100000) average loss: 0.6801463961601257\n",
      "[proc 1][Train](135/100000) average regularization: 4.7681132855359465e-06\n",
      "[proc 1][Train] 1 steps take 2.732 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.225\n",
      "[proc 0][Train](135/100000) average pos_loss: 1.0127233266830444\n",
      "[proc 0][Train](135/100000) average neg_loss: 0.3304017186164856\n",
      "[proc 0][Train](135/100000) average loss: 0.6715625524520874\n",
      "[proc 0][Train](135/100000) average regularization: 4.516982244240353e-06\n",
      "[proc 0][Train] 1 steps take 2.662 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.165\n",
      "[proc 1][Train](136/100000) average pos_loss: 1.0198612213134766\n",
      "[proc 1][Train](136/100000) average neg_loss: 0.5713139772415161\n",
      "[proc 1][Train](136/100000) average loss: 0.7955875992774963\n",
      "[proc 1][Train](136/100000) average regularization: 4.463179266167572e-06\n",
      "[proc 1][Train] 1 steps take 2.676 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.170\n",
      "[proc 0][Train](136/100000) average pos_loss: 1.0451996326446533\n",
      "[proc 0][Train](136/100000) average neg_loss: 0.5698975324630737\n",
      "[proc 0][Train](136/100000) average loss: 0.8075485825538635\n",
      "[proc 0][Train](136/100000) average regularization: 4.5926321945444215e-06\n",
      "[proc 0][Train] 1 steps take 2.639 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.413, backward: 0.070, update: 2.155\n",
      "[proc 1][Train](137/100000) average pos_loss: 1.052296757698059\n",
      "[proc 1][Train](137/100000) average neg_loss: 0.33685851097106934\n",
      "[proc 1][Train](137/100000) average loss: 0.6945776343345642\n",
      "[proc 1][Train](137/100000) average regularization: 4.595061454892857e-06\n",
      "[proc 1][Train] 1 steps take 2.631 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.405, backward: 0.070, update: 2.155\n",
      "[proc 0][Train](137/100000) average pos_loss: 0.9618362188339233\n",
      "[proc 0][Train](137/100000) average neg_loss: 0.344959020614624\n",
      "[proc 0][Train](137/100000) average loss: 0.6533976197242737\n",
      "[proc 0][Train](137/100000) average regularization: 4.458230250747874e-06\n",
      "[proc 0][Train] 1 steps take 2.600 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.099\n",
      "[proc 1][Train](138/100000) average pos_loss: 0.9622223377227783\n",
      "[proc 1][Train](138/100000) average neg_loss: 0.5924109816551208\n",
      "[proc 1][Train](138/100000) average loss: 0.777316689491272\n",
      "[proc 1][Train](138/100000) average regularization: 4.54291239293525e-06\n",
      "[proc 1][Train] 1 steps take 2.641 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.143\n",
      "[proc 0][Train](138/100000) average pos_loss: 0.996511697769165\n",
      "[proc 0][Train](138/100000) average neg_loss: 0.5001246929168701\n",
      "[proc 0][Train](138/100000) average loss: 0.7483181953430176\n",
      "[proc 0][Train](138/100000) average regularization: 4.457305749383522e-06\n",
      "[proc 0][Train] 1 steps take 2.577 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.071, update: 2.072\n",
      "[proc 1][Train](139/100000) average pos_loss: 1.0554815530776978\n",
      "[proc 1][Train](139/100000) average neg_loss: 0.3468971252441406\n",
      "[proc 1][Train](139/100000) average loss: 0.7011893391609192\n",
      "[proc 1][Train](139/100000) average regularization: 4.649394213629421e-06\n",
      "[proc 1][Train] 1 steps take 2.640 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.138\n",
      "[proc 0][Train](139/100000) average pos_loss: 0.9871004819869995\n",
      "[proc 0][Train](139/100000) average neg_loss: 0.32194703817367554\n",
      "[proc 0][Train](139/100000) average loss: 0.6545237302780151\n",
      "[proc 0][Train](139/100000) average regularization: 4.5268566282175016e-06\n",
      "[proc 0][Train] 1 steps take 2.651 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.144\n",
      "[proc 1][Train](140/100000) average pos_loss: 1.012662410736084\n",
      "[proc 1][Train](140/100000) average neg_loss: 0.60912024974823\n",
      "[proc 1][Train](140/100000) average loss: 0.810891330242157\n",
      "[proc 1][Train](140/100000) average regularization: 4.483298653212842e-06\n",
      "[proc 1][Train] 1 steps take 2.635 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.130\n",
      "[proc 0][Train](140/100000) average pos_loss: 1.0084670782089233\n",
      "[proc 0][Train](140/100000) average neg_loss: 0.5781121253967285\n",
      "[proc 0][Train](140/100000) average loss: 0.7932896018028259\n",
      "[proc 0][Train](140/100000) average regularization: 4.562302819977049e-06\n",
      "[proc 0][Train] 1 steps take 2.646 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.071, update: 2.142\n",
      "[proc 1][Train](141/100000) average pos_loss: 1.0905834436416626\n",
      "[proc 1][Train](141/100000) average neg_loss: 0.3370054364204407\n",
      "[proc 1][Train](141/100000) average loss: 0.713794469833374\n",
      "[proc 1][Train](141/100000) average regularization: 4.494711447478039e-06\n",
      "[proc 1][Train] 1 steps take 2.624 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.128\n",
      "[proc 0][Train](141/100000) average pos_loss: 1.02421236038208\n",
      "[proc 0][Train](141/100000) average neg_loss: 0.3242117762565613\n",
      "[proc 0][Train](141/100000) average loss: 0.6742120981216431\n",
      "[proc 0][Train](141/100000) average regularization: 4.711635938292602e-06\n",
      "[proc 0][Train] 1 steps take 2.672 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.070, update: 2.161\n",
      "[proc 1][Train](142/100000) average pos_loss: 0.9702996611595154\n",
      "[proc 1][Train](142/100000) average neg_loss: 0.6036814451217651\n",
      "[proc 1][Train](142/100000) average loss: 0.7869905233383179\n",
      "[proc 1][Train](142/100000) average regularization: 4.843426722800359e-06\n",
      "[proc 1][Train] 1 steps take 2.665 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.163\n",
      "[proc 0][Train](142/100000) average pos_loss: 1.0153748989105225\n",
      "[proc 0][Train](142/100000) average neg_loss: 0.5362798571586609\n",
      "[proc 0][Train](142/100000) average loss: 0.7758274078369141\n",
      "[proc 0][Train](142/100000) average regularization: 4.465405709197512e-06\n",
      "[proc 0][Train] 1 steps take 2.725 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.225\n",
      "[proc 1][Train](143/100000) average pos_loss: 1.0269064903259277\n",
      "[proc 1][Train](143/100000) average neg_loss: 0.32138538360595703\n",
      "[proc 1][Train](143/100000) average loss: 0.6741459369659424\n",
      "[proc 1][Train](143/100000) average regularization: 4.59625334769953e-06\n",
      "[proc 1][Train] 1 steps take 2.682 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.179\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](143/100000) average pos_loss: 0.9675946235656738\n",
      "[proc 0][Train](143/100000) average neg_loss: 0.352138876914978\n",
      "[proc 0][Train](143/100000) average loss: 0.6598667502403259\n",
      "[proc 0][Train](143/100000) average regularization: 4.434364655026002e-06\n",
      "[proc 0][Train] 1 steps take 2.659 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.414, backward: 0.071, update: 2.174\n",
      "[proc 1][Train](144/100000) average pos_loss: 0.9976010322570801\n",
      "[proc 1][Train](144/100000) average neg_loss: 0.5486695170402527\n",
      "[proc 1][Train](144/100000) average loss: 0.7731353044509888\n",
      "[proc 1][Train](144/100000) average regularization: 4.556133717414923e-06\n",
      "[proc 1][Train] 1 steps take 2.668 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.069, update: 2.171\n",
      "[proc 0][Train](144/100000) average pos_loss: 0.9982150793075562\n",
      "[proc 0][Train](144/100000) average neg_loss: 0.5439677238464355\n",
      "[proc 0][Train](144/100000) average loss: 0.7710914015769958\n",
      "[proc 0][Train](144/100000) average regularization: 4.665713731810683e-06\n",
      "[proc 0][Train] 1 steps take 2.680 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.449, backward: 0.070, update: 2.160\n",
      "[proc 1][Train](145/100000) average pos_loss: 0.9894456267356873\n",
      "[proc 1][Train](145/100000) average neg_loss: 0.34139174222946167\n",
      "[proc 1][Train](145/100000) average loss: 0.6654186844825745\n",
      "[proc 1][Train](145/100000) average regularization: 4.387506578495959e-06\n",
      "[proc 1][Train] 1 steps take 2.675 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.434, backward: 0.069, update: 2.156\n",
      "[proc 0][Train](145/100000) average pos_loss: 0.9918123483657837\n",
      "[proc 0][Train](145/100000) average neg_loss: 0.3302062749862671\n",
      "[proc 0][Train](145/100000) average loss: 0.6610093116760254\n",
      "[proc 0][Train](145/100000) average regularization: 4.590955995809054e-06\n",
      "[proc 0][Train] 1 steps take 2.614 seconds\n",
      "[proc 0]sample: 0.014, forward: 0.402, backward: 0.070, update: 2.127\n",
      "[proc 1][Train](146/100000) average pos_loss: 0.9982311725616455\n",
      "[proc 1][Train](146/100000) average neg_loss: 0.5807638764381409\n",
      "[proc 1][Train](146/100000) average loss: 0.7894974946975708\n",
      "[proc 1][Train](146/100000) average regularization: 4.591280230670236e-06\n",
      "[proc 1][Train] 1 steps take 2.680 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.426, backward: 0.070, update: 2.164\n",
      "[proc 0][Train](146/100000) average pos_loss: 1.0043046474456787\n",
      "[proc 0][Train](146/100000) average neg_loss: 0.5335531830787659\n",
      "[proc 0][Train](146/100000) average loss: 0.7689288854598999\n",
      "[proc 0][Train](146/100000) average regularization: 4.61838862975128e-06\n",
      "[proc 0][Train] 1 steps take 2.687 seconds\n",
      "[proc 0]sample: 0.014, forward: 0.425, backward: 0.070, update: 2.177\n",
      "[proc 1][Train](147/100000) average pos_loss: 1.008786916732788\n",
      "[proc 1][Train](147/100000) average neg_loss: 0.3223685026168823\n",
      "[proc 1][Train](147/100000) average loss: 0.6655777096748352\n",
      "[proc 1][Train](147/100000) average regularization: 4.594218808051664e-06\n",
      "[proc 1][Train] 1 steps take 2.734 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.477, backward: 0.069, update: 2.186\n",
      "[proc 0][Train](147/100000) average pos_loss: 0.9647863507270813\n",
      "[proc 0][Train](147/100000) average neg_loss: 0.34266194701194763\n",
      "[proc 0][Train](147/100000) average loss: 0.6537241339683533\n",
      "[proc 0][Train](147/100000) average regularization: 4.762115622725105e-06\n",
      "[proc 0][Train] 1 steps take 2.620 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.439, backward: 0.070, update: 2.110\n",
      "[proc 1][Train](148/100000) average pos_loss: 0.9680098295211792\n",
      "[proc 1][Train](148/100000) average neg_loss: 0.5393322110176086\n",
      "[proc 1][Train](148/100000) average loss: 0.7536710500717163\n",
      "[proc 1][Train](148/100000) average regularization: 4.617191279976396e-06\n",
      "[proc 1][Train] 1 steps take 2.707 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.477, backward: 0.070, update: 2.158\n",
      "[proc 0][Train](148/100000) average pos_loss: 1.007670283317566\n",
      "[proc 0][Train](148/100000) average neg_loss: 0.5773674249649048\n",
      "[proc 0][Train](148/100000) average loss: 0.7925188541412354\n",
      "[proc 0][Train](148/100000) average regularization: 4.869819349551108e-06\n",
      "[proc 0][Train] 1 steps take 2.666 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.168\n",
      "[proc 1][Train](149/100000) average pos_loss: 1.0186102390289307\n",
      "[proc 1][Train](149/100000) average neg_loss: 0.3241587281227112\n",
      "[proc 1][Train](149/100000) average loss: 0.6713844537734985\n",
      "[proc 1][Train](149/100000) average regularization: 4.522033577814e-06\n",
      "[proc 1][Train] 1 steps take 2.686 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.180\n",
      "[proc 0][Train](149/100000) average pos_loss: 0.9985440969467163\n",
      "[proc 0][Train](149/100000) average neg_loss: 0.3223802447319031\n",
      "[proc 0][Train](149/100000) average loss: 0.6604621410369873\n",
      "[proc 0][Train](149/100000) average regularization: 4.652622465073364e-06\n",
      "[proc 0][Train] 1 steps take 2.646 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.456, backward: 0.070, update: 2.118\n",
      "[proc 1][Train](150/100000) average pos_loss: 0.8973408937454224\n",
      "[proc 1][Train](150/100000) average neg_loss: 0.5498813390731812\n",
      "[proc 1][Train](150/100000) average loss: 0.7236111164093018\n",
      "[proc 1][Train](150/100000) average regularization: 4.833910224988358e-06\n",
      "[proc 1][Train] 1 steps take 2.673 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.170\n",
      "[proc 0][Train](150/100000) average pos_loss: 0.9665912389755249\n",
      "[proc 0][Train](150/100000) average neg_loss: 0.5159035921096802\n",
      "[proc 0][Train](150/100000) average loss: 0.7412474155426025\n",
      "[proc 0][Train](150/100000) average regularization: 4.621665084414417e-06\n",
      "[proc 0][Train] 1 steps take 2.640 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.132\n",
      "[proc 1][Train](151/100000) average pos_loss: 1.0127577781677246\n",
      "[proc 1][Train](151/100000) average neg_loss: 0.2921312153339386\n",
      "[proc 1][Train](151/100000) average loss: 0.6524444818496704\n",
      "[proc 1][Train](151/100000) average regularization: 4.806458491657395e-06\n",
      "[proc 1][Train] 1 steps take 2.618 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.071, update: 2.120\n",
      "[proc 0][Train](151/100000) average pos_loss: 0.8894164562225342\n",
      "[proc 0][Train](151/100000) average neg_loss: 0.37850576639175415\n",
      "[proc 0][Train](151/100000) average loss: 0.6339610815048218\n",
      "[proc 0][Train](151/100000) average regularization: 4.693884875450749e-06\n",
      "[proc 0][Train] 1 steps take 2.595 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.069, update: 2.089\n",
      "[proc 1][Train](152/100000) average pos_loss: 0.9241646528244019\n",
      "[proc 1][Train](152/100000) average neg_loss: 0.5655404329299927\n",
      "[proc 1][Train](152/100000) average loss: 0.7448525428771973\n",
      "[proc 1][Train](152/100000) average regularization: 4.865149549004855e-06\n",
      "[proc 1][Train] 1 steps take 2.582 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.421, backward: 0.069, update: 2.090\n",
      "[proc 0][Train](152/100000) average pos_loss: 0.980701208114624\n",
      "[proc 0][Train](152/100000) average neg_loss: 0.5361067652702332\n",
      "[proc 0][Train](152/100000) average loss: 0.758404016494751\n",
      "[proc 0][Train](152/100000) average regularization: 4.722095127362991e-06\n",
      "[proc 0][Train] 1 steps take 2.657 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.153\n",
      "[proc 1][Train](153/100000) average pos_loss: 1.003892421722412\n",
      "[proc 1][Train](153/100000) average neg_loss: 0.3343942165374756\n",
      "[proc 1][Train](153/100000) average loss: 0.6691433191299438\n",
      "[proc 1][Train](153/100000) average regularization: 4.740580152429175e-06\n",
      "[proc 1][Train] 1 steps take 2.596 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.100\n",
      "[proc 0][Train](153/100000) average pos_loss: 0.9682337045669556\n",
      "[proc 0][Train](153/100000) average neg_loss: 0.322567880153656\n",
      "[proc 0][Train](153/100000) average loss: 0.6454007625579834\n",
      "[proc 0][Train](153/100000) average regularization: 4.8641245484759565e-06\n",
      "[proc 0][Train] 1 steps take 2.580 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.406, backward: 0.069, update: 2.104\n",
      "[proc 1][Train](154/100000) average pos_loss: 0.8939211368560791\n",
      "[proc 1][Train](154/100000) average neg_loss: 0.5672293901443481\n",
      "[proc 1][Train](154/100000) average loss: 0.7305752635002136\n",
      "[proc 1][Train](154/100000) average regularization: 4.724549853563076e-06\n",
      "[proc 1][Train] 1 steps take 2.592 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.069, update: 2.096\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](154/100000) average pos_loss: 0.9659552574157715\n",
      "[proc 0][Train](154/100000) average neg_loss: 0.5303310751914978\n",
      "[proc 0][Train](154/100000) average loss: 0.748143196105957\n",
      "[proc 0][Train](154/100000) average regularization: 4.762612661579624e-06\n",
      "[proc 0][Train] 1 steps take 2.741 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.069, update: 2.236\n",
      "[proc 1][Train](155/100000) average pos_loss: 0.9691283702850342\n",
      "[proc 1][Train](155/100000) average neg_loss: 0.34755581617355347\n",
      "[proc 1][Train](155/100000) average loss: 0.6583421230316162\n",
      "[proc 1][Train](155/100000) average regularization: 4.9209043027076405e-06\n",
      "[proc 1][Train] 1 steps take 2.587 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.421, backward: 0.069, update: 2.095\n",
      "[proc 0][Train](155/100000) average pos_loss: 0.9421359300613403\n",
      "[proc 0][Train](155/100000) average neg_loss: 0.34409099817276\n",
      "[proc 0][Train](155/100000) average loss: 0.6431134939193726\n",
      "[proc 0][Train](155/100000) average regularization: 4.846644060307881e-06\n",
      "[proc 0][Train] 1 steps take 2.625 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.422, backward: 0.069, update: 2.132\n",
      "[proc 1][Train](156/100000) average pos_loss: 1.0040335655212402\n",
      "[proc 1][Train](156/100000) average neg_loss: 0.5429617166519165\n",
      "[proc 1][Train](156/100000) average loss: 0.7734976410865784\n",
      "[proc 1][Train](156/100000) average regularization: 4.830144462175667e-06\n",
      "[proc 1][Train] 1 steps take 2.654 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.441, backward: 0.070, update: 2.142\n",
      "[proc 0][Train](156/100000) average pos_loss: 0.9691437482833862\n",
      "[proc 0][Train](156/100000) average neg_loss: 0.5330305695533752\n",
      "[proc 0][Train](156/100000) average loss: 0.7510871887207031\n",
      "[proc 0][Train](156/100000) average regularization: 4.815859483642271e-06\n",
      "[proc 0][Train] 1 steps take 2.630 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.125\n",
      "[proc 1][Train](157/100000) average pos_loss: 0.9411332607269287\n",
      "[proc 1][Train](157/100000) average neg_loss: 0.3277861475944519\n",
      "[proc 1][Train](157/100000) average loss: 0.6344597339630127\n",
      "[proc 1][Train](157/100000) average regularization: 4.977119715476874e-06\n",
      "[proc 1][Train] 1 steps take 2.616 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.410, backward: 0.070, update: 2.135\n",
      "[proc 0][Train](157/100000) average pos_loss: 0.946590781211853\n",
      "[proc 0][Train](157/100000) average neg_loss: 0.3212367594242096\n",
      "[proc 0][Train](157/100000) average loss: 0.6339137554168701\n",
      "[proc 0][Train](157/100000) average regularization: 4.8413303375127725e-06\n",
      "[proc 0][Train] 1 steps take 2.667 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.071, update: 2.158\n",
      "[proc 1][Train](158/100000) average pos_loss: 0.997554361820221\n",
      "[proc 1][Train](158/100000) average neg_loss: 0.5910240411758423\n",
      "[proc 1][Train](158/100000) average loss: 0.794289231300354\n",
      "[proc 1][Train](158/100000) average regularization: 4.963568244420458e-06\n",
      "[proc 1][Train] 1 steps take 2.675 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.447, backward: 0.070, update: 2.156\n",
      "[proc 0][Train](158/100000) average pos_loss: 0.9609420299530029\n",
      "[proc 0][Train](158/100000) average neg_loss: 0.5673854947090149\n",
      "[proc 0][Train](158/100000) average loss: 0.7641637325286865\n",
      "[proc 0][Train](158/100000) average regularization: 4.766012807522202e-06\n",
      "[proc 0][Train] 1 steps take 2.602 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.102\n",
      "[proc 1][Train](159/100000) average pos_loss: 0.9243203997612\n",
      "[proc 1][Train](159/100000) average neg_loss: 0.32703596353530884\n",
      "[proc 1][Train](159/100000) average loss: 0.6256781816482544\n",
      "[proc 1][Train](159/100000) average regularization: 4.803071078640642e-06\n",
      "[proc 1][Train] 1 steps take 2.661 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.165\n",
      "[proc 0][Train](159/100000) average pos_loss: 0.8984785676002502\n",
      "[proc 0][Train](159/100000) average neg_loss: 0.3371819257736206\n",
      "[proc 0][Train](159/100000) average loss: 0.6178302764892578\n",
      "[proc 0][Train](159/100000) average regularization: 4.898187853541458e-06\n",
      "[proc 0][Train] 1 steps take 2.568 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.398, backward: 0.069, update: 2.099\n",
      "[proc 1][Train](160/100000) average pos_loss: 0.9496393203735352\n",
      "[proc 1][Train](160/100000) average neg_loss: 0.5770683884620667\n",
      "[proc 1][Train](160/100000) average loss: 0.7633538246154785\n",
      "[proc 1][Train](160/100000) average regularization: 4.939654445479391e-06\n",
      "[proc 1][Train] 1 steps take 2.509 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.413, backward: 0.069, update: 2.025\n",
      "[proc 0][Train](160/100000) average pos_loss: 0.9407307505607605\n",
      "[proc 0][Train](160/100000) average neg_loss: 0.5595275163650513\n",
      "[proc 0][Train](160/100000) average loss: 0.7501291036605835\n",
      "[proc 0][Train](160/100000) average regularization: 5.0025164455291815e-06\n",
      "[proc 0][Train] 1 steps take 2.590 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.395, backward: 0.069, update: 2.125\n",
      "[proc 1][Train](161/100000) average pos_loss: 0.9500634074211121\n",
      "[proc 1][Train](161/100000) average neg_loss: 0.3539947271347046\n",
      "[proc 1][Train](161/100000) average loss: 0.6520290374755859\n",
      "[proc 1][Train](161/100000) average regularization: 5.200546638661763e-06\n",
      "[proc 1][Train] 1 steps take 2.701 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.476, backward: 0.070, update: 2.137\n",
      "[proc 0][Train](161/100000) average pos_loss: 0.9398197531700134\n",
      "[proc 0][Train](161/100000) average neg_loss: 0.3628438115119934\n",
      "[proc 0][Train](161/100000) average loss: 0.6513317823410034\n",
      "[proc 0][Train](161/100000) average regularization: 4.80152084492147e-06\n",
      "[proc 0][Train] 1 steps take 2.634 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.432, backward: 0.070, update: 2.116\n",
      "[proc 1][Train](162/100000) average pos_loss: 0.9498426914215088\n",
      "[proc 1][Train](162/100000) average neg_loss: 0.5477343797683716\n",
      "[proc 1][Train](162/100000) average loss: 0.7487885355949402\n",
      "[proc 1][Train](162/100000) average regularization: 4.968844223185442e-06\n",
      "[proc 1][Train] 1 steps take 2.693 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.435, backward: 0.070, update: 2.170\n",
      "[proc 0][Train](162/100000) average pos_loss: 0.9283751249313354\n",
      "[proc 0][Train](162/100000) average neg_loss: 0.5866272449493408\n",
      "[proc 0][Train](162/100000) average loss: 0.7575011849403381\n",
      "[proc 0][Train](162/100000) average regularization: 4.9462896640761755e-06\n",
      "[proc 0][Train] 1 steps take 2.660 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.456, backward: 0.070, update: 2.117\n",
      "[proc 1][Train](163/100000) average pos_loss: 0.899307370185852\n",
      "[proc 1][Train](163/100000) average neg_loss: 0.29340827465057373\n",
      "[proc 1][Train](163/100000) average loss: 0.5963578224182129\n",
      "[proc 1][Train](163/100000) average regularization: 5.041077656642301e-06\n",
      "[proc 1][Train] 1 steps take 2.609 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.112\n",
      "[proc 0][Train](163/100000) average pos_loss: 0.9403197765350342\n",
      "[proc 0][Train](163/100000) average neg_loss: 0.33904775977134705\n",
      "[proc 0][Train](163/100000) average loss: 0.6396837830543518\n",
      "[proc 0][Train](163/100000) average regularization: 4.79897653349326e-06\n",
      "[proc 0][Train] 1 steps take 2.737 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.451, backward: 0.069, update: 2.216\n",
      "[proc 1][Train](164/100000) average pos_loss: 0.9116578102111816\n",
      "[proc 1][Train](164/100000) average neg_loss: 0.5512678623199463\n",
      "[proc 1][Train](164/100000) average loss: 0.731462836265564\n",
      "[proc 1][Train](164/100000) average regularization: 4.957303644914646e-06\n",
      "[proc 1][Train] 1 steps take 2.619 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.408, backward: 0.069, update: 2.141\n",
      "[proc 0][Train](164/100000) average pos_loss: 0.8397068977355957\n",
      "[proc 0][Train](164/100000) average neg_loss: 0.5829511880874634\n",
      "[proc 0][Train](164/100000) average loss: 0.7113290429115295\n",
      "[proc 0][Train](164/100000) average regularization: 4.778473794431193e-06\n",
      "[proc 0][Train] 1 steps take 2.645 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.138\n",
      "[proc 1][Train](165/100000) average pos_loss: 0.9139641523361206\n",
      "[proc 1][Train](165/100000) average neg_loss: 0.3411265015602112\n",
      "[proc 1][Train](165/100000) average loss: 0.6275453567504883\n",
      "[proc 1][Train](165/100000) average regularization: 4.847437594435178e-06\n",
      "[proc 1][Train] 1 steps take 2.617 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.071, update: 2.116\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](165/100000) average pos_loss: 0.9140968322753906\n",
      "[proc 0][Train](165/100000) average neg_loss: 0.3183598518371582\n",
      "[proc 0][Train](165/100000) average loss: 0.6162283420562744\n",
      "[proc 0][Train](165/100000) average regularization: 5.017063358536689e-06\n",
      "[proc 0][Train] 1 steps take 2.686 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.184\n",
      "[proc 1][Train](166/100000) average pos_loss: 0.9356465339660645\n",
      "[proc 1][Train](166/100000) average neg_loss: 0.5482965707778931\n",
      "[proc 1][Train](166/100000) average loss: 0.7419715523719788\n",
      "[proc 1][Train](166/100000) average regularization: 4.974972398485988e-06\n",
      "[proc 1][Train] 1 steps take 2.692 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.184\n",
      "[proc 0][Train](166/100000) average pos_loss: 0.961479663848877\n",
      "[proc 0][Train](166/100000) average neg_loss: 0.5540453791618347\n",
      "[proc 0][Train](166/100000) average loss: 0.7577625513076782\n",
      "[proc 0][Train](166/100000) average regularization: 5.112545750307618e-06\n",
      "[proc 0][Train] 1 steps take 2.701 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.197\n",
      "[proc 1][Train](167/100000) average pos_loss: 0.903032660484314\n",
      "[proc 1][Train](167/100000) average neg_loss: 0.3247823119163513\n",
      "[proc 1][Train](167/100000) average loss: 0.6139074563980103\n",
      "[proc 1][Train](167/100000) average regularization: 5.14547582497471e-06\n",
      "[proc 1][Train] 1 steps take 2.668 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.160\n",
      "[proc 0][Train](167/100000) average pos_loss: 0.8948678970336914\n",
      "[proc 0][Train](167/100000) average neg_loss: 0.312730073928833\n",
      "[proc 0][Train](167/100000) average loss: 0.6037989854812622\n",
      "[proc 0][Train](167/100000) average regularization: 4.875146714766743e-06\n",
      "[proc 0][Train] 1 steps take 2.678 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.430, backward: 0.070, update: 2.176\n",
      "[proc 1][Train](168/100000) average pos_loss: 0.8938635587692261\n",
      "[proc 1][Train](168/100000) average neg_loss: 0.5467694997787476\n",
      "[proc 1][Train](168/100000) average loss: 0.7203165292739868\n",
      "[proc 1][Train](168/100000) average regularization: 4.9890986701939255e-06\n",
      "[proc 1][Train] 1 steps take 2.747 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.245\n",
      "[proc 0][Train](168/100000) average pos_loss: 0.9064632654190063\n",
      "[proc 0][Train](168/100000) average neg_loss: 0.5551785230636597\n",
      "[proc 0][Train](168/100000) average loss: 0.730820894241333\n",
      "[proc 0][Train](168/100000) average regularization: 5.042431894253241e-06\n",
      "[proc 0][Train] 1 steps take 2.822 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.525, backward: 0.070, update: 2.225\n",
      "[proc 1][Train](169/100000) average pos_loss: 0.9026515483856201\n",
      "[proc 1][Train](169/100000) average neg_loss: 0.31027916073799133\n",
      "[proc 1][Train](169/100000) average loss: 0.6064653396606445\n",
      "[proc 1][Train](169/100000) average regularization: 5.109800440550316e-06\n",
      "[proc 1][Train] 1 steps take 2.739 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.442, backward: 0.069, update: 2.226\n",
      "[proc 0][Train](169/100000) average pos_loss: 0.9045242071151733\n",
      "[proc 0][Train](169/100000) average neg_loss: 0.3455256521701813\n",
      "[proc 0][Train](169/100000) average loss: 0.6250249147415161\n",
      "[proc 0][Train](169/100000) average regularization: 4.882033408648567e-06\n",
      "[proc 0][Train] 1 steps take 2.641 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.412, backward: 0.070, update: 2.158\n",
      "[proc 1][Train](170/100000) average pos_loss: 0.9380114674568176\n",
      "[proc 1][Train](170/100000) average neg_loss: 0.5653594732284546\n",
      "[proc 1][Train](170/100000) average loss: 0.7516855001449585\n",
      "[proc 1][Train](170/100000) average regularization: 5.194741788727697e-06\n",
      "[proc 1][Train] 1 steps take 2.681 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.411, backward: 0.071, update: 2.198\n",
      "[proc 0][Train](170/100000) average pos_loss: 0.8875380754470825\n",
      "[proc 0][Train](170/100000) average neg_loss: 0.5415906310081482\n",
      "[proc 0][Train](170/100000) average loss: 0.714564323425293\n",
      "[proc 0][Train](170/100000) average regularization: 5.0494345487095416e-06\n",
      "[proc 0][Train] 1 steps take 2.586 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.420, backward: 0.069, update: 2.095\n",
      "[proc 1][Train](171/100000) average pos_loss: 0.9091758131980896\n",
      "[proc 1][Train](171/100000) average neg_loss: 0.32528233528137207\n",
      "[proc 1][Train](171/100000) average loss: 0.6172291040420532\n",
      "[proc 1][Train](171/100000) average regularization: 5.185077952774009e-06\n",
      "[proc 1][Train] 1 steps take 2.651 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.153\n",
      "[proc 0][Train](171/100000) average pos_loss: 0.8778197169303894\n",
      "[proc 0][Train](171/100000) average neg_loss: 0.3220965266227722\n",
      "[proc 0][Train](171/100000) average loss: 0.5999581217765808\n",
      "[proc 0][Train](171/100000) average regularization: 5.013172994949855e-06\n",
      "[proc 0][Train] 1 steps take 2.693 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.442, backward: 0.070, update: 2.180\n",
      "[proc 1][Train](172/100000) average pos_loss: 0.9052655100822449\n",
      "[proc 1][Train](172/100000) average neg_loss: 0.5820403695106506\n",
      "[proc 1][Train](172/100000) average loss: 0.7436529397964478\n",
      "[proc 1][Train](172/100000) average regularization: 4.986759904568316e-06\n",
      "[proc 1][Train] 1 steps take 2.706 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.209\n",
      "[proc 0][Train](172/100000) average pos_loss: 0.9193265438079834\n",
      "[proc 0][Train](172/100000) average neg_loss: 0.544758141040802\n",
      "[proc 0][Train](172/100000) average loss: 0.7320423126220703\n",
      "[proc 0][Train](172/100000) average regularization: 5.2038558351341635e-06\n",
      "[proc 0][Train] 1 steps take 2.688 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.437, backward: 0.070, update: 2.179\n",
      "[proc 1][Train](173/100000) average pos_loss: 0.9649981260299683\n",
      "[proc 1][Train](173/100000) average neg_loss: 0.33458203077316284\n",
      "[proc 1][Train](173/100000) average loss: 0.6497900485992432\n",
      "[proc 1][Train](173/100000) average regularization: 5.190062438487075e-06\n",
      "[proc 1][Train] 1 steps take 2.705 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.205\n",
      "[proc 0][Train](173/100000) average pos_loss: 0.9144797325134277\n",
      "[proc 0][Train](173/100000) average neg_loss: 0.3100763261318207\n",
      "[proc 0][Train](173/100000) average loss: 0.6122780442237854\n",
      "[proc 0][Train](173/100000) average regularization: 5.121438789501553e-06\n",
      "[proc 0][Train] 1 steps take 2.789 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.458, backward: 0.070, update: 2.260\n",
      "[proc 1][Train](174/100000) average pos_loss: 0.8917091488838196\n",
      "[proc 1][Train](174/100000) average neg_loss: 0.5693807601928711\n",
      "[proc 1][Train](174/100000) average loss: 0.730544924736023\n",
      "[proc 1][Train](174/100000) average regularization: 5.2659102038887795e-06\n",
      "[proc 1][Train] 1 steps take 2.690 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.440, backward: 0.069, update: 2.180\n",
      "[proc 0][Train](174/100000) average pos_loss: 0.858239471912384\n",
      "[proc 0][Train](174/100000) average neg_loss: 0.5494639873504639\n",
      "[proc 0][Train](174/100000) average loss: 0.7038516998291016\n",
      "[proc 0][Train](174/100000) average regularization: 5.194805453356821e-06\n",
      "[proc 0][Train] 1 steps take 2.613 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.112\n",
      "[proc 1][Train](175/100000) average pos_loss: 0.9373804330825806\n",
      "[proc 1][Train](175/100000) average neg_loss: 0.3414941430091858\n",
      "[proc 1][Train](175/100000) average loss: 0.6394373178482056\n",
      "[proc 1][Train](175/100000) average regularization: 5.014707767259097e-06\n",
      "[proc 1][Train] 1 steps take 2.689 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.195\n",
      "[proc 0][Train](175/100000) average pos_loss: 0.889336347579956\n",
      "[proc 0][Train](175/100000) average neg_loss: 0.33939504623413086\n",
      "[proc 0][Train](175/100000) average loss: 0.6143656969070435\n",
      "[proc 0][Train](175/100000) average regularization: 5.204024546401342e-06\n",
      "[proc 0][Train] 1 steps take 2.665 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.436, backward: 0.069, update: 2.157\n",
      "[proc 1][Train](176/100000) average pos_loss: 0.8912898302078247\n",
      "[proc 1][Train](176/100000) average neg_loss: 0.549758791923523\n",
      "[proc 1][Train](176/100000) average loss: 0.7205243110656738\n",
      "[proc 1][Train](176/100000) average regularization: 5.292769401421538e-06\n",
      "[proc 1][Train] 1 steps take 2.602 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.407, backward: 0.069, update: 2.123\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](176/100000) average pos_loss: 0.8827727437019348\n",
      "[proc 0][Train](176/100000) average neg_loss: 0.56690913438797\n",
      "[proc 0][Train](176/100000) average loss: 0.7248409390449524\n",
      "[proc 0][Train](176/100000) average regularization: 5.23364860782749e-06\n",
      "[proc 0][Train] 1 steps take 2.625 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.120\n",
      "[proc 1][Train](177/100000) average pos_loss: 0.9066434502601624\n",
      "[proc 1][Train](177/100000) average neg_loss: 0.30344367027282715\n",
      "[proc 1][Train](177/100000) average loss: 0.6050435304641724\n",
      "[proc 1][Train](177/100000) average regularization: 5.256356871541357e-06\n",
      "[proc 1][Train] 1 steps take 2.611 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.426, backward: 0.069, update: 2.099\n",
      "[proc 0][Train](177/100000) average pos_loss: 0.8900147676467896\n",
      "[proc 0][Train](177/100000) average neg_loss: 0.30464738607406616\n",
      "[proc 0][Train](177/100000) average loss: 0.5973310470581055\n",
      "[proc 0][Train](177/100000) average regularization: 5.333509761840105e-06\n",
      "[proc 0][Train] 1 steps take 2.688 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.436, backward: 0.070, update: 2.166\n",
      "[proc 1][Train](178/100000) average pos_loss: 0.9076611399650574\n",
      "[proc 1][Train](178/100000) average neg_loss: 0.5817192792892456\n",
      "[proc 1][Train](178/100000) average loss: 0.7446901798248291\n",
      "[proc 1][Train](178/100000) average regularization: 4.894139692623867e-06\n",
      "[proc 1][Train] 1 steps take 2.694 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.432, backward: 0.071, update: 2.178\n",
      "[proc 0][Train](178/100000) average pos_loss: 0.9318721294403076\n",
      "[proc 0][Train](178/100000) average neg_loss: 0.5762553215026855\n",
      "[proc 0][Train](178/100000) average loss: 0.7540637254714966\n",
      "[proc 0][Train](178/100000) average regularization: 5.086290002509486e-06\n",
      "[proc 0][Train] 1 steps take 2.744 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.435, backward: 0.070, update: 2.222\n",
      "[proc 1][Train](179/100000) average pos_loss: 0.8739310503005981\n",
      "[proc 1][Train](179/100000) average neg_loss: 0.32407140731811523\n",
      "[proc 1][Train](179/100000) average loss: 0.5990012288093567\n",
      "[proc 1][Train](179/100000) average regularization: 5.159118245501304e-06\n",
      "[proc 1][Train] 1 steps take 2.697 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.196\n",
      "[proc 0][Train](179/100000) average pos_loss: 0.9018166661262512\n",
      "[proc 0][Train](179/100000) average neg_loss: 0.31686514616012573\n",
      "[proc 0][Train](179/100000) average loss: 0.6093409061431885\n",
      "[proc 0][Train](179/100000) average regularization: 5.326978225639323e-06\n",
      "[proc 0][Train] 1 steps take 2.710 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.421, backward: 0.070, update: 2.218\n",
      "[proc 1][Train](180/100000) average pos_loss: 0.8282849788665771\n",
      "[proc 1][Train](180/100000) average neg_loss: 0.5512940883636475\n",
      "[proc 1][Train](180/100000) average loss: 0.6897895336151123\n",
      "[proc 1][Train](180/100000) average regularization: 5.107029210194014e-06\n",
      "[proc 1][Train] 1 steps take 2.724 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.450, backward: 0.069, update: 2.202\n",
      "[proc 0][Train](180/100000) average pos_loss: 0.9027711153030396\n",
      "[proc 0][Train](180/100000) average neg_loss: 0.5517259240150452\n",
      "[proc 0][Train](180/100000) average loss: 0.7272485494613647\n",
      "[proc 0][Train](180/100000) average regularization: 5.021692231821362e-06\n",
      "[proc 0][Train] 1 steps take 2.706 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.201\n",
      "[proc 1][Train](181/100000) average pos_loss: 0.8906748294830322\n",
      "[proc 1][Train](181/100000) average neg_loss: 0.3083508610725403\n",
      "[proc 1][Train](181/100000) average loss: 0.5995128154754639\n",
      "[proc 1][Train](181/100000) average regularization: 5.12067799718352e-06\n",
      "[proc 1][Train] 1 steps take 2.688 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.071, update: 2.181\n",
      "[proc 0][Train](181/100000) average pos_loss: 0.8782939314842224\n",
      "[proc 0][Train](181/100000) average neg_loss: 0.31737202405929565\n",
      "[proc 0][Train](181/100000) average loss: 0.597832977771759\n",
      "[proc 0][Train](181/100000) average regularization: 5.3001158448751085e-06\n",
      "[proc 0][Train] 1 steps take 2.677 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.171\n",
      "[proc 1][Train](182/100000) average pos_loss: 0.8664661645889282\n",
      "[proc 1][Train](182/100000) average neg_loss: 0.5936906337738037\n",
      "[proc 1][Train](182/100000) average loss: 0.730078399181366\n",
      "[proc 1][Train](182/100000) average regularization: 5.025612153986003e-06\n",
      "[proc 1][Train] 1 steps take 2.981 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.479\n",
      "[proc 0][Train](182/100000) average pos_loss: 0.8499009609222412\n",
      "[proc 0][Train](182/100000) average neg_loss: 0.5321031808853149\n",
      "[proc 0][Train](182/100000) average loss: 0.6910020709037781\n",
      "[proc 0][Train](182/100000) average regularization: 5.207875801716e-06\n",
      "[proc 0][Train] 1 steps take 2.636 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.406, backward: 0.069, update: 2.160\n",
      "[proc 1][Train](183/100000) average pos_loss: 0.8711923360824585\n",
      "[proc 1][Train](183/100000) average neg_loss: 0.32608333230018616\n",
      "[proc 1][Train](183/100000) average loss: 0.5986378192901611\n",
      "[proc 1][Train](183/100000) average regularization: 5.2685777518490795e-06\n",
      "[proc 1][Train] 1 steps take 2.563 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.411, backward: 0.070, update: 2.082\n",
      "[proc 0][Train](183/100000) average pos_loss: 0.8381618857383728\n",
      "[proc 0][Train](183/100000) average neg_loss: 0.3119156062602997\n",
      "[proc 0][Train](183/100000) average loss: 0.575038731098175\n",
      "[proc 0][Train](183/100000) average regularization: 5.193250217416789e-06\n",
      "[proc 0][Train] 1 steps take 2.649 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.425, backward: 0.069, update: 2.154\n",
      "[proc 1][Train](184/100000) average pos_loss: 0.8764714002609253\n",
      "[proc 1][Train](184/100000) average neg_loss: 0.5800670981407166\n",
      "[proc 1][Train](184/100000) average loss: 0.7282692193984985\n",
      "[proc 1][Train](184/100000) average regularization: 5.203371529205469e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.414, backward: 0.070, update: 2.185\n",
      "[proc 0][Train](184/100000) average pos_loss: 0.8436034917831421\n",
      "[proc 0][Train](184/100000) average neg_loss: 0.5684260129928589\n",
      "[proc 0][Train](184/100000) average loss: 0.7060147523880005\n",
      "[proc 0][Train](184/100000) average regularization: 5.068146947451169e-06\n",
      "[proc 0][Train] 1 steps take 2.573 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.393, backward: 0.070, update: 2.108\n",
      "[proc 1][Train](185/100000) average pos_loss: 0.8946423530578613\n",
      "[proc 1][Train](185/100000) average neg_loss: 0.3184477388858795\n",
      "[proc 1][Train](185/100000) average loss: 0.6065450310707092\n",
      "[proc 1][Train](185/100000) average regularization: 5.346986199583625e-06\n",
      "[proc 1][Train] 1 steps take 2.569 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.075\n",
      "[proc 0][Train](185/100000) average pos_loss: 0.8364296555519104\n",
      "[proc 0][Train](185/100000) average neg_loss: 0.3164782226085663\n",
      "[proc 0][Train](185/100000) average loss: 0.5764539241790771\n",
      "[proc 0][Train](185/100000) average regularization: 4.985266059520654e-06\n",
      "[proc 0][Train] 1 steps take 2.648 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.436, backward: 0.070, update: 2.140\n",
      "[proc 1][Train](186/100000) average pos_loss: 0.8161070942878723\n",
      "[proc 1][Train](186/100000) average neg_loss: 0.5815290212631226\n",
      "[proc 1][Train](186/100000) average loss: 0.6988180875778198\n",
      "[proc 1][Train](186/100000) average regularization: 5.288196462061023e-06\n",
      "[proc 1][Train] 1 steps take 2.607 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.420, backward: 0.070, update: 2.115\n",
      "[proc 0][Train](186/100000) average pos_loss: 0.818983256816864\n",
      "[proc 0][Train](186/100000) average neg_loss: 0.5562499165534973\n",
      "[proc 0][Train](186/100000) average loss: 0.6876165866851807\n",
      "[proc 0][Train](186/100000) average regularization: 5.296184554026695e-06\n",
      "[proc 0][Train] 1 steps take 2.674 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.437, backward: 0.069, update: 2.165\n",
      "[proc 1][Train](187/100000) average pos_loss: 0.8918595314025879\n",
      "[proc 1][Train](187/100000) average neg_loss: 0.337239533662796\n",
      "[proc 1][Train](187/100000) average loss: 0.6145495176315308\n",
      "[proc 1][Train](187/100000) average regularization: 5.245072316029109e-06\n",
      "[proc 1][Train] 1 steps take 2.628 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.412, backward: 0.070, update: 2.145\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](187/100000) average pos_loss: 0.8975009918212891\n",
      "[proc 0][Train](187/100000) average neg_loss: 0.3395116329193115\n",
      "[proc 0][Train](187/100000) average loss: 0.6185063123703003\n",
      "[proc 0][Train](187/100000) average regularization: 5.292592049954692e-06\n",
      "[proc 0][Train] 1 steps take 2.623 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.125\n",
      "[proc 1][Train](188/100000) average pos_loss: 0.9098901748657227\n",
      "[proc 1][Train](188/100000) average neg_loss: 0.5590108633041382\n",
      "[proc 1][Train](188/100000) average loss: 0.7344505190849304\n",
      "[proc 1][Train](188/100000) average regularization: 5.41566259926185e-06\n",
      "[proc 1][Train] 1 steps take 2.653 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.071, update: 2.154\n",
      "[proc 0][Train](188/100000) average pos_loss: 0.8687083125114441\n",
      "[proc 0][Train](188/100000) average neg_loss: 0.6011697053909302\n",
      "[proc 0][Train](188/100000) average loss: 0.7349389791488647\n",
      "[proc 0][Train](188/100000) average regularization: 5.471463737194426e-06\n",
      "[proc 0][Train] 1 steps take 2.712 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.478, backward: 0.071, update: 2.162\n",
      "[proc 1][Train](189/100000) average pos_loss: 0.8768444061279297\n",
      "[proc 1][Train](189/100000) average neg_loss: 0.3255469799041748\n",
      "[proc 1][Train](189/100000) average loss: 0.6011956930160522\n",
      "[proc 1][Train](189/100000) average regularization: 5.301762485032668e-06\n",
      "[proc 1][Train] 1 steps take 2.678 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.449, backward: 0.069, update: 2.159\n",
      "[proc 0][Train](189/100000) average pos_loss: 0.8170645833015442\n",
      "[proc 0][Train](189/100000) average neg_loss: 0.30209842324256897\n",
      "[proc 0][Train](189/100000) average loss: 0.5595815181732178\n",
      "[proc 0][Train](189/100000) average regularization: 5.610536845779279e-06\n",
      "[proc 0][Train] 1 steps take 2.643 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.139\n",
      "[proc 1][Train](190/100000) average pos_loss: 0.8325417637825012\n",
      "[proc 1][Train](190/100000) average neg_loss: 0.5789158344268799\n",
      "[proc 1][Train](190/100000) average loss: 0.7057287693023682\n",
      "[proc 1][Train](190/100000) average regularization: 5.409453478932846e-06\n",
      "[proc 1][Train] 1 steps take 2.658 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.150\n",
      "[proc 0][Train](190/100000) average pos_loss: 0.8430367112159729\n",
      "[proc 0][Train](190/100000) average neg_loss: 0.5412497520446777\n",
      "[proc 0][Train](190/100000) average loss: 0.6921432018280029\n",
      "[proc 0][Train](190/100000) average regularization: 5.245999091130216e-06\n",
      "[proc 0][Train] 1 steps take 2.593 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.420, backward: 0.070, update: 2.101\n",
      "[proc 1][Train](191/100000) average pos_loss: 0.8894975781440735\n",
      "[proc 1][Train](191/100000) average neg_loss: 0.30508852005004883\n",
      "[proc 1][Train](191/100000) average loss: 0.5972930192947388\n",
      "[proc 1][Train](191/100000) average regularization: 5.336779395292979e-06\n",
      "[proc 1][Train] 1 steps take 2.656 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.154\n",
      "[proc 0][Train](191/100000) average pos_loss: 0.8897379636764526\n",
      "[proc 0][Train](191/100000) average neg_loss: 0.3317161798477173\n",
      "[proc 0][Train](191/100000) average loss: 0.610727071762085\n",
      "[proc 0][Train](191/100000) average regularization: 5.23785502082319e-06\n",
      "[proc 0][Train] 1 steps take 2.639 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.445, backward: 0.070, update: 2.123\n",
      "[proc 1][Train](192/100000) average pos_loss: 0.8309033513069153\n",
      "[proc 1][Train](192/100000) average neg_loss: 0.5527180433273315\n",
      "[proc 1][Train](192/100000) average loss: 0.6918107271194458\n",
      "[proc 1][Train](192/100000) average regularization: 5.23582366440678e-06\n",
      "[proc 1][Train] 1 steps take 2.702 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.069, update: 2.204\n",
      "[proc 0][Train](192/100000) average pos_loss: 0.9104425311088562\n",
      "[proc 0][Train](192/100000) average neg_loss: 0.535386323928833\n",
      "[proc 0][Train](192/100000) average loss: 0.722914457321167\n",
      "[proc 0][Train](192/100000) average regularization: 5.4535275921807624e-06\n",
      "[proc 0][Train] 1 steps take 2.674 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.070, update: 2.163\n",
      "[proc 1][Train](193/100000) average pos_loss: 0.8694028854370117\n",
      "[proc 1][Train](193/100000) average neg_loss: 0.3274928033351898\n",
      "[proc 1][Train](193/100000) average loss: 0.598447859287262\n",
      "[proc 1][Train](193/100000) average regularization: 5.488698661793023e-06\n",
      "[proc 1][Train] 1 steps take 2.642 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.429, backward: 0.070, update: 2.128\n",
      "[proc 0][Train](193/100000) average pos_loss: 0.8582608699798584\n",
      "[proc 0][Train](193/100000) average neg_loss: 0.3234233558177948\n",
      "[proc 0][Train](193/100000) average loss: 0.5908421277999878\n",
      "[proc 0][Train](193/100000) average regularization: 5.4249335335043725e-06\n",
      "[proc 0][Train] 1 steps take 2.670 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.433, backward: 0.070, update: 2.150\n",
      "[proc 1][Train](194/100000) average pos_loss: 0.8025927543640137\n",
      "[proc 1][Train](194/100000) average neg_loss: 0.5632866024971008\n",
      "[proc 1][Train](194/100000) average loss: 0.6829396486282349\n",
      "[proc 1][Train](194/100000) average regularization: 5.3948215281707235e-06\n",
      "[proc 1][Train] 1 steps take 2.638 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.455, backward: 0.070, update: 2.099\n",
      "[proc 0][Train](194/100000) average pos_loss: 0.8625021576881409\n",
      "[proc 0][Train](194/100000) average neg_loss: 0.5719746351242065\n",
      "[proc 0][Train](194/100000) average loss: 0.7172384262084961\n",
      "[proc 0][Train](194/100000) average regularization: 5.516078090295196e-06\n",
      "[proc 0][Train] 1 steps take 2.686 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.438, backward: 0.070, update: 2.161\n",
      "[proc 1][Train](195/100000) average pos_loss: 0.8692117929458618\n",
      "[proc 1][Train](195/100000) average neg_loss: 0.34524649381637573\n",
      "[proc 1][Train](195/100000) average loss: 0.6072291135787964\n",
      "[proc 1][Train](195/100000) average regularization: 5.539624453376746e-06\n",
      "[proc 1][Train] 1 steps take 2.728 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.419, backward: 0.070, update: 2.237\n",
      "[proc 0][Train](195/100000) average pos_loss: 0.8724770545959473\n",
      "[proc 0][Train](195/100000) average neg_loss: 0.3146602511405945\n",
      "[proc 0][Train](195/100000) average loss: 0.5935686826705933\n",
      "[proc 0][Train](195/100000) average regularization: 5.61361503059743e-06\n",
      "[proc 0][Train] 1 steps take 2.679 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.442, backward: 0.069, update: 2.165\n",
      "[proc 1][Train](196/100000) average pos_loss: 0.829705536365509\n",
      "[proc 1][Train](196/100000) average neg_loss: 0.5495980978012085\n",
      "[proc 1][Train](196/100000) average loss: 0.6896518468856812\n",
      "[proc 1][Train](196/100000) average regularization: 5.460010015667649e-06\n",
      "[proc 1][Train] 1 steps take 2.592 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.434, backward: 0.070, update: 2.086\n",
      "[proc 0][Train](196/100000) average pos_loss: 0.8426225781440735\n",
      "[proc 0][Train](196/100000) average neg_loss: 0.5833357572555542\n",
      "[proc 0][Train](196/100000) average loss: 0.7129791975021362\n",
      "[proc 0][Train](196/100000) average regularization: 5.335671630746219e-06\n",
      "[proc 0][Train] 1 steps take 2.728 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.430, backward: 0.069, update: 2.228\n",
      "[proc 1][Train](197/100000) average pos_loss: 0.8583559393882751\n",
      "[proc 1][Train](197/100000) average neg_loss: 0.3244398236274719\n",
      "[proc 1][Train](197/100000) average loss: 0.5913978815078735\n",
      "[proc 1][Train](197/100000) average regularization: 5.3588937589665875e-06\n",
      "[proc 1][Train] 1 steps take 2.640 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.419, backward: 0.069, update: 2.150\n",
      "[proc 0][Train](197/100000) average pos_loss: 0.8396192193031311\n",
      "[proc 0][Train](197/100000) average neg_loss: 0.3041157126426697\n",
      "[proc 0][Train](197/100000) average loss: 0.5718674659729004\n",
      "[proc 0][Train](197/100000) average regularization: 5.315174803399714e-06\n",
      "[proc 0][Train] 1 steps take 2.666 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.419, backward: 0.070, update: 2.175\n",
      "[proc 1][Train](198/100000) average pos_loss: 0.8041632175445557\n",
      "[proc 1][Train](198/100000) average neg_loss: 0.5390250086784363\n",
      "[proc 1][Train](198/100000) average loss: 0.6715941429138184\n",
      "[proc 1][Train](198/100000) average regularization: 5.570444500335725e-06\n",
      "[proc 1][Train] 1 steps take 2.673 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.435, backward: 0.069, update: 2.167\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](198/100000) average pos_loss: 0.7827413082122803\n",
      "[proc 0][Train](198/100000) average neg_loss: 0.5266755223274231\n",
      "[proc 0][Train](198/100000) average loss: 0.6547083854675293\n",
      "[proc 0][Train](198/100000) average regularization: 5.451427568914369e-06\n",
      "[proc 0][Train] 1 steps take 2.641 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.071, update: 2.138\n",
      "[proc 1][Train](199/100000) average pos_loss: 0.7952890396118164\n",
      "[proc 1][Train](199/100000) average neg_loss: 0.36233651638031006\n",
      "[proc 1][Train](199/100000) average loss: 0.5788127779960632\n",
      "[proc 1][Train](199/100000) average regularization: 5.5439472816942725e-06\n",
      "[proc 1][Train] 1 steps take 2.650 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.428, backward: 0.069, update: 2.150\n",
      "[proc 0][Train](199/100000) average pos_loss: 0.8056408166885376\n",
      "[proc 0][Train](199/100000) average neg_loss: 0.34826964139938354\n",
      "[proc 0][Train](199/100000) average loss: 0.5769551992416382\n",
      "[proc 0][Train](199/100000) average regularization: 5.574903752858518e-06\n",
      "[proc 0][Train] 1 steps take 2.604 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.097\n",
      "[proc 1][Train](200/100000) average pos_loss: 0.8560718297958374\n",
      "[proc 1][Train](200/100000) average neg_loss: 0.5015563368797302\n",
      "[proc 1][Train](200/100000) average loss: 0.6788140535354614\n",
      "[proc 1][Train](200/100000) average regularization: 5.488203896675259e-06\n",
      "[proc 1][Train] 1 steps take 2.640 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.431, backward: 0.069, update: 2.138\n",
      "[proc 0][Train](200/100000) average pos_loss: 0.8804076910018921\n",
      "[proc 0][Train](200/100000) average neg_loss: 0.5328623056411743\n",
      "[proc 0][Train](200/100000) average loss: 0.7066349983215332\n",
      "[proc 0][Train](200/100000) average regularization: 5.628580765915103e-06\n",
      "[proc 0][Train] 1 steps take 2.635 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.132\n",
      "[proc 1][Train](201/100000) average pos_loss: 0.8443435430526733\n",
      "[proc 1][Train](201/100000) average neg_loss: 0.34785187244415283\n",
      "[proc 1][Train](201/100000) average loss: 0.5960977077484131\n",
      "[proc 1][Train](201/100000) average regularization: 5.70296788282576e-06\n",
      "[proc 1][Train] 1 steps take 2.689 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.180\n",
      "[proc 0][Train](201/100000) average pos_loss: 0.813980221748352\n",
      "[proc 0][Train](201/100000) average neg_loss: 0.29845646023750305\n",
      "[proc 0][Train](201/100000) average loss: 0.5562183260917664\n",
      "[proc 0][Train](201/100000) average regularization: 5.520073045772733e-06\n",
      "[proc 0][Train] 1 steps take 2.633 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.421, backward: 0.069, update: 2.142\n",
      "[proc 1][Train](202/100000) average pos_loss: 0.7864174842834473\n",
      "[proc 1][Train](202/100000) average neg_loss: 0.5638637542724609\n",
      "[proc 1][Train](202/100000) average loss: 0.6751406192779541\n",
      "[proc 1][Train](202/100000) average regularization: 5.492664513440104e-06\n",
      "[proc 1][Train] 1 steps take 2.643 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.069, update: 2.148\n",
      "[proc 0][Train](202/100000) average pos_loss: 0.7948182225227356\n",
      "[proc 0][Train](202/100000) average neg_loss: 0.5735805034637451\n",
      "[proc 0][Train](202/100000) average loss: 0.684199333190918\n",
      "[proc 0][Train](202/100000) average regularization: 5.494870947586605e-06\n",
      "[proc 0][Train] 1 steps take 2.622 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.443, backward: 0.070, update: 2.107\n",
      "[proc 1][Train](203/100000) average pos_loss: 0.8094615340232849\n",
      "[proc 1][Train](203/100000) average neg_loss: 0.28546470403671265\n",
      "[proc 1][Train](203/100000) average loss: 0.5474631190299988\n",
      "[proc 1][Train](203/100000) average regularization: 5.8163209359918255e-06\n",
      "[proc 1][Train] 1 steps take 2.617 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.119\n",
      "[proc 0][Train](203/100000) average pos_loss: 0.8503233194351196\n",
      "[proc 0][Train](203/100000) average neg_loss: 0.32682159543037415\n",
      "[proc 0][Train](203/100000) average loss: 0.5885724425315857\n",
      "[proc 0][Train](203/100000) average regularization: 5.860665169166168e-06\n",
      "[proc 0][Train] 1 steps take 2.637 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.069, update: 2.131\n",
      "[proc 1][Train](204/100000) average pos_loss: 0.8065052032470703\n",
      "[proc 1][Train](204/100000) average neg_loss: 0.5560436248779297\n",
      "[proc 1][Train](204/100000) average loss: 0.6812744140625\n",
      "[proc 1][Train](204/100000) average regularization: 5.313937435857952e-06\n",
      "[proc 1][Train] 1 steps take 2.635 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.416, backward: 0.070, update: 2.148\n",
      "[proc 0][Train](204/100000) average pos_loss: 0.8212260007858276\n",
      "[proc 0][Train](204/100000) average neg_loss: 0.5984444618225098\n",
      "[proc 0][Train](204/100000) average loss: 0.7098352313041687\n",
      "[proc 0][Train](204/100000) average regularization: 5.672114639310166e-06\n",
      "[proc 0][Train] 1 steps take 2.631 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.128\n",
      "[proc 1][Train](205/100000) average pos_loss: 0.8893911838531494\n",
      "[proc 1][Train](205/100000) average neg_loss: 0.325987845659256\n",
      "[proc 1][Train](205/100000) average loss: 0.6076894998550415\n",
      "[proc 1][Train](205/100000) average regularization: 5.5024120229063556e-06\n",
      "[proc 1][Train] 1 steps take 2.643 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.144\n",
      "[proc 0][Train](205/100000) average pos_loss: 0.8469510078430176\n",
      "[proc 0][Train](205/100000) average neg_loss: 0.31774190068244934\n",
      "[proc 0][Train](205/100000) average loss: 0.5823464393615723\n",
      "[proc 0][Train](205/100000) average regularization: 5.600491476798197e-06\n",
      "[proc 0][Train] 1 steps take 2.646 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.444, backward: 0.070, update: 2.130\n",
      "[proc 1][Train](206/100000) average pos_loss: 0.8126661777496338\n",
      "[proc 1][Train](206/100000) average neg_loss: 0.5530667304992676\n",
      "[proc 1][Train](206/100000) average loss: 0.6828664541244507\n",
      "[proc 1][Train](206/100000) average regularization: 5.481920652528061e-06\n",
      "[proc 1][Train] 1 steps take 2.626 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.127\n",
      "[proc 0][Train](206/100000) average pos_loss: 0.8671643137931824\n",
      "[proc 0][Train](206/100000) average neg_loss: 0.5667902231216431\n",
      "[proc 0][Train](206/100000) average loss: 0.7169772386550903\n",
      "[proc 0][Train](206/100000) average regularization: 5.614243946183706e-06\n",
      "[proc 0][Train] 1 steps take 2.590 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.396, backward: 0.070, update: 2.123\n",
      "[proc 1][Train](207/100000) average pos_loss: 0.8037931323051453\n",
      "[proc 1][Train](207/100000) average neg_loss: 0.3021549582481384\n",
      "[proc 1][Train](207/100000) average loss: 0.5529740452766418\n",
      "[proc 1][Train](207/100000) average regularization: 5.649945705954451e-06\n",
      "[proc 1][Train] 1 steps take 2.619 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.425, backward: 0.070, update: 2.122\n",
      "[proc 0][Train](207/100000) average pos_loss: 0.7806310653686523\n",
      "[proc 0][Train](207/100000) average neg_loss: 0.326835960149765\n",
      "[proc 0][Train](207/100000) average loss: 0.5537335276603699\n",
      "[proc 0][Train](207/100000) average regularization: 5.824889285577228e-06\n",
      "[proc 0][Train] 1 steps take 2.619 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.069, update: 2.113\n",
      "[proc 1][Train](208/100000) average pos_loss: 0.8355919718742371\n",
      "[proc 1][Train](208/100000) average neg_loss: 0.5796852111816406\n",
      "[proc 1][Train](208/100000) average loss: 0.7076386213302612\n",
      "[proc 1][Train](208/100000) average regularization: 5.419824901764514e-06\n",
      "[proc 1][Train] 1 steps take 2.607 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.069, update: 2.111\n",
      "[proc 0][Train](208/100000) average pos_loss: 0.8078271150588989\n",
      "[proc 0][Train](208/100000) average neg_loss: 0.5975036025047302\n",
      "[proc 0][Train](208/100000) average loss: 0.7026653289794922\n",
      "[proc 0][Train](208/100000) average regularization: 5.672546649293508e-06\n",
      "[proc 0][Train] 1 steps take 2.703 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.405, backward: 0.070, update: 2.226\n",
      "[proc 1][Train](209/100000) average pos_loss: 0.8230302929878235\n",
      "[proc 1][Train](209/100000) average neg_loss: 0.30465179681777954\n",
      "[proc 1][Train](209/100000) average loss: 0.5638410449028015\n",
      "[proc 1][Train](209/100000) average regularization: 5.634726221614983e-06\n",
      "[proc 1][Train] 1 steps take 2.639 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.406, backward: 0.070, update: 2.148\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](209/100000) average pos_loss: 0.837128221988678\n",
      "[proc 0][Train](209/100000) average neg_loss: 0.3421283960342407\n",
      "[proc 0][Train](209/100000) average loss: 0.5896283388137817\n",
      "[proc 0][Train](209/100000) average regularization: 5.6147282521124e-06\n",
      "[proc 0][Train] 1 steps take 2.692 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.442, backward: 0.070, update: 2.165\n",
      "[proc 1][Train](210/100000) average pos_loss: 0.8224406838417053\n",
      "[proc 1][Train](210/100000) average neg_loss: 0.5582776069641113\n",
      "[proc 1][Train](210/100000) average loss: 0.6903591156005859\n",
      "[proc 1][Train](210/100000) average regularization: 5.740109827456763e-06\n",
      "[proc 1][Train] 1 steps take 2.666 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.433, backward: 0.070, update: 2.148\n",
      "[proc 0][Train](210/100000) average pos_loss: 0.8413062691688538\n",
      "[proc 0][Train](210/100000) average neg_loss: 0.5268802642822266\n",
      "[proc 0][Train](210/100000) average loss: 0.6840932369232178\n",
      "[proc 0][Train](210/100000) average regularization: 5.6639469221408945e-06\n",
      "[proc 0][Train] 1 steps take 2.679 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.434, backward: 0.071, update: 2.156\n",
      "[proc 1][Train](211/100000) average pos_loss: 0.8546938896179199\n",
      "[proc 1][Train](211/100000) average neg_loss: 0.303188294172287\n",
      "[proc 1][Train](211/100000) average loss: 0.5789411067962646\n",
      "[proc 1][Train](211/100000) average regularization: 5.472038537845947e-06\n",
      "[proc 1][Train] 1 steps take 2.662 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.160\n",
      "[proc 0][Train](211/100000) average pos_loss: 0.7955067157745361\n",
      "[proc 0][Train](211/100000) average neg_loss: 0.30089980363845825\n",
      "[proc 0][Train](211/100000) average loss: 0.5482032299041748\n",
      "[proc 0][Train](211/100000) average regularization: 5.767109996668296e-06\n",
      "[proc 0][Train] 1 steps take 2.751 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.438, backward: 0.070, update: 2.241\n",
      "[proc 1][Train](212/100000) average pos_loss: 0.7698838710784912\n",
      "[proc 1][Train](212/100000) average neg_loss: 0.5990736484527588\n",
      "[proc 1][Train](212/100000) average loss: 0.684478759765625\n",
      "[proc 1][Train](212/100000) average regularization: 5.597007202595705e-06\n",
      "[proc 1][Train] 1 steps take 2.687 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.185\n",
      "[proc 0][Train](212/100000) average pos_loss: 0.8055696487426758\n",
      "[proc 0][Train](212/100000) average neg_loss: 0.5851732492446899\n",
      "[proc 0][Train](212/100000) average loss: 0.6953714489936829\n",
      "[proc 0][Train](212/100000) average regularization: 5.654739652527496e-06\n",
      "[proc 0][Train] 1 steps take 2.702 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.195\n",
      "[proc 1][Train](213/100000) average pos_loss: 0.8047467470169067\n",
      "[proc 1][Train](213/100000) average neg_loss: 0.34266191720962524\n",
      "[proc 1][Train](213/100000) average loss: 0.5737043619155884\n",
      "[proc 1][Train](213/100000) average regularization: 5.734982551075518e-06\n",
      "[proc 1][Train] 1 steps take 2.775 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.442, backward: 0.070, update: 2.263\n",
      "[proc 0][Train](213/100000) average pos_loss: 0.8136906623840332\n",
      "[proc 0][Train](213/100000) average neg_loss: 0.30757665634155273\n",
      "[proc 0][Train](213/100000) average loss: 0.560633659362793\n",
      "[proc 0][Train](213/100000) average regularization: 5.621319814963499e-06\n",
      "[proc 0][Train] 1 steps take 2.641 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.139\n",
      "[proc 1][Train](214/100000) average pos_loss: 0.7860081195831299\n",
      "[proc 1][Train](214/100000) average neg_loss: 0.6152133345603943\n",
      "[proc 1][Train](214/100000) average loss: 0.7006107568740845\n",
      "[proc 1][Train](214/100000) average regularization: 5.695141226169653e-06\n",
      "[proc 1][Train] 1 steps take 2.637 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.127\n",
      "[proc 0][Train](214/100000) average pos_loss: 0.8001586198806763\n",
      "[proc 0][Train](214/100000) average neg_loss: 0.5563117861747742\n",
      "[proc 0][Train](214/100000) average loss: 0.6782351732254028\n",
      "[proc 0][Train](214/100000) average regularization: 5.861375029780902e-06\n",
      "[proc 0][Train] 1 steps take 2.653 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.151\n",
      "[proc 1][Train](215/100000) average pos_loss: 0.8537931442260742\n",
      "[proc 1][Train](215/100000) average neg_loss: 0.3059622347354889\n",
      "[proc 1][Train](215/100000) average loss: 0.5798776745796204\n",
      "[proc 1][Train](215/100000) average regularization: 5.746984243160114e-06\n",
      "[proc 1][Train] 1 steps take 2.678 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.071, update: 2.177\n",
      "[proc 0][Train](215/100000) average pos_loss: 0.7652268409729004\n",
      "[proc 0][Train](215/100000) average neg_loss: 0.32316482067108154\n",
      "[proc 0][Train](215/100000) average loss: 0.544195830821991\n",
      "[proc 0][Train](215/100000) average regularization: 5.936302841291763e-06\n",
      "[proc 0][Train] 1 steps take 2.733 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.429, backward: 0.070, update: 2.232\n",
      "[proc 1][Train](216/100000) average pos_loss: 0.7804234027862549\n",
      "[proc 1][Train](216/100000) average neg_loss: 0.5265085101127625\n",
      "[proc 1][Train](216/100000) average loss: 0.653465986251831\n",
      "[proc 1][Train](216/100000) average regularization: 5.961016086075688e-06\n",
      "[proc 1][Train] 1 steps take 2.647 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.437, backward: 0.070, update: 2.137\n",
      "[proc 0][Train](216/100000) average pos_loss: 0.8013442158699036\n",
      "[proc 0][Train](216/100000) average neg_loss: 0.5797699689865112\n",
      "[proc 0][Train](216/100000) average loss: 0.6905571222305298\n",
      "[proc 0][Train](216/100000) average regularization: 5.732643330702558e-06\n",
      "[proc 0][Train] 1 steps take 2.976 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.437, backward: 0.069, update: 2.468\n",
      "[proc 1][Train](217/100000) average pos_loss: 0.7844536304473877\n",
      "[proc 1][Train](217/100000) average neg_loss: 0.3104626536369324\n",
      "[proc 1][Train](217/100000) average loss: 0.5474581718444824\n",
      "[proc 1][Train](217/100000) average regularization: 5.893740308238193e-06\n",
      "[proc 1][Train] 1 steps take 2.824 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.428, backward: 0.069, update: 2.326\n",
      "[proc 0][Train](217/100000) average pos_loss: 0.8243229389190674\n",
      "[proc 0][Train](217/100000) average neg_loss: 0.307007372379303\n",
      "[proc 0][Train](217/100000) average loss: 0.5656651258468628\n",
      "[proc 0][Train](217/100000) average regularization: 5.6395861065539066e-06\n",
      "[proc 0][Train] 1 steps take 2.762 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.451, backward: 0.069, update: 2.240\n",
      "[proc 1][Train](218/100000) average pos_loss: 0.7704957127571106\n",
      "[proc 1][Train](218/100000) average neg_loss: 0.5785473585128784\n",
      "[proc 1][Train](218/100000) average loss: 0.6745215654373169\n",
      "[proc 1][Train](218/100000) average regularization: 5.760167823609663e-06\n",
      "[proc 1][Train] 1 steps take 2.658 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.152\n",
      "[proc 0][Train](218/100000) average pos_loss: 0.7907377481460571\n",
      "[proc 0][Train](218/100000) average neg_loss: 0.5358850359916687\n",
      "[proc 0][Train](218/100000) average loss: 0.6633113622665405\n",
      "[proc 0][Train](218/100000) average regularization: 5.924546712776646e-06\n",
      "[proc 0][Train] 1 steps take 2.813 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.474, backward: 0.070, update: 2.267\n",
      "[proc 1][Train](219/100000) average pos_loss: 0.8051372766494751\n",
      "[proc 1][Train](219/100000) average neg_loss: 0.32509273290634155\n",
      "[proc 1][Train](219/100000) average loss: 0.5651149749755859\n",
      "[proc 1][Train](219/100000) average regularization: 5.8287173487769905e-06\n",
      "[proc 1][Train] 1 steps take 2.870 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.421, backward: 0.070, update: 2.377\n",
      "[proc 0][Train](219/100000) average pos_loss: 0.8198221921920776\n",
      "[proc 0][Train](219/100000) average neg_loss: 0.3222048282623291\n",
      "[proc 0][Train](219/100000) average loss: 0.5710135102272034\n",
      "[proc 0][Train](219/100000) average regularization: 5.786178462585667e-06\n",
      "[proc 0][Train] 1 steps take 2.691 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.470, backward: 0.069, update: 2.149\n",
      "[proc 1][Train](220/100000) average pos_loss: 0.7263476252555847\n",
      "[proc 1][Train](220/100000) average neg_loss: 0.5779710412025452\n",
      "[proc 1][Train](220/100000) average loss: 0.6521593332290649\n",
      "[proc 1][Train](220/100000) average regularization: 5.850193701917306e-06\n",
      "[proc 1][Train] 1 steps take 2.804 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.444, backward: 0.069, update: 2.289\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](220/100000) average pos_loss: 0.7794954776763916\n",
      "[proc 0][Train](220/100000) average neg_loss: 0.5654606223106384\n",
      "[proc 0][Train](220/100000) average loss: 0.6724780797958374\n",
      "[proc 0][Train](220/100000) average regularization: 5.951173534413101e-06\n",
      "[proc 0][Train] 1 steps take 2.784 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.439, backward: 0.070, update: 2.273\n",
      "[proc 1][Train](221/100000) average pos_loss: 0.8082103729248047\n",
      "[proc 1][Train](221/100000) average neg_loss: 0.31316250562667847\n",
      "[proc 1][Train](221/100000) average loss: 0.560686469078064\n",
      "[proc 1][Train](221/100000) average regularization: 5.835905540152453e-06\n",
      "[proc 1][Train] 1 steps take 2.853 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.476, backward: 0.069, update: 2.306\n",
      "[proc 0][Train](221/100000) average pos_loss: 0.7710275650024414\n",
      "[proc 0][Train](221/100000) average neg_loss: 0.3137434720993042\n",
      "[proc 0][Train](221/100000) average loss: 0.5423855185508728\n",
      "[proc 0][Train](221/100000) average regularization: 5.801909082947532e-06\n",
      "[proc 0][Train] 1 steps take 2.835 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.440, backward: 0.070, update: 2.323\n",
      "[proc 1][Train](222/100000) average pos_loss: 0.8130007386207581\n",
      "[proc 1][Train](222/100000) average neg_loss: 0.5440419912338257\n",
      "[proc 1][Train](222/100000) average loss: 0.6785213947296143\n",
      "[proc 1][Train](222/100000) average regularization: 5.780340870842338e-06\n",
      "[proc 1][Train] 1 steps take 2.697 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.430, backward: 0.070, update: 2.194\n",
      "[proc 0][Train](222/100000) average pos_loss: 0.8358018398284912\n",
      "[proc 0][Train](222/100000) average neg_loss: 0.5605175495147705\n",
      "[proc 0][Train](222/100000) average loss: 0.6981596946716309\n",
      "[proc 0][Train](222/100000) average regularization: 5.884426172997337e-06\n",
      "[proc 0][Train] 1 steps take 2.684 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.440, backward: 0.070, update: 2.172\n",
      "[proc 1][Train](223/100000) average pos_loss: 0.8523481488227844\n",
      "[proc 1][Train](223/100000) average neg_loss: 0.2744947671890259\n",
      "[proc 1][Train](223/100000) average loss: 0.5634214878082275\n",
      "[proc 1][Train](223/100000) average regularization: 5.727391908294521e-06\n",
      "[proc 1][Train] 1 steps take 2.818 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.398, backward: 0.070, update: 2.348\n",
      "[proc 0][Train](223/100000) average pos_loss: 0.7616009712219238\n",
      "[proc 0][Train](223/100000) average neg_loss: 0.3007211685180664\n",
      "[proc 0][Train](223/100000) average loss: 0.5311610698699951\n",
      "[proc 0][Train](223/100000) average regularization: 5.996766503812978e-06\n",
      "[proc 0][Train] 1 steps take 2.690 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.446, backward: 0.070, update: 2.172\n",
      "[proc 1][Train](224/100000) average pos_loss: 0.7561820149421692\n",
      "[proc 1][Train](224/100000) average neg_loss: 0.5761858224868774\n",
      "[proc 1][Train](224/100000) average loss: 0.6661839485168457\n",
      "[proc 1][Train](224/100000) average regularization: 5.892949502595002e-06\n",
      "[proc 1][Train] 1 steps take 2.699 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.436, backward: 0.070, update: 2.191\n",
      "[proc 0][Train](224/100000) average pos_loss: 0.8081938028335571\n",
      "[proc 0][Train](224/100000) average neg_loss: 0.5295431017875671\n",
      "[proc 0][Train](224/100000) average loss: 0.6688684225082397\n",
      "[proc 0][Train](224/100000) average regularization: 5.833509931107983e-06\n",
      "[proc 0][Train] 1 steps take 2.745 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.433, backward: 0.069, update: 2.241\n",
      "[proc 1][Train](225/100000) average pos_loss: 0.8283129334449768\n",
      "[proc 1][Train](225/100000) average neg_loss: 0.31305962800979614\n",
      "[proc 1][Train](225/100000) average loss: 0.5706862807273865\n",
      "[proc 1][Train](225/100000) average regularization: 5.661506747856038e-06\n",
      "[proc 1][Train] 1 steps take 2.932 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.439, backward: 0.070, update: 2.408\n",
      "[proc 0][Train](225/100000) average pos_loss: 0.806556224822998\n",
      "[proc 0][Train](225/100000) average neg_loss: 0.31875160336494446\n",
      "[proc 0][Train](225/100000) average loss: 0.5626538991928101\n",
      "[proc 0][Train](225/100000) average regularization: 5.7079614634858444e-06\n",
      "[proc 0][Train] 1 steps take 2.452 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.387, backward: 0.069, update: 1.977\n",
      "[proc 1][Train](226/100000) average pos_loss: 0.7647395133972168\n",
      "[proc 1][Train](226/100000) average neg_loss: 0.556439220905304\n",
      "[proc 1][Train](226/100000) average loss: 0.660589337348938\n",
      "[proc 1][Train](226/100000) average regularization: 5.692242211807752e-06\n",
      "[proc 1][Train] 1 steps take 2.669 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.434, backward: 0.070, update: 2.145\n",
      "[proc 0][Train](226/100000) average pos_loss: 0.7762333750724792\n",
      "[proc 0][Train](226/100000) average neg_loss: 0.5448724627494812\n",
      "[proc 0][Train](226/100000) average loss: 0.6605529189109802\n",
      "[proc 0][Train](226/100000) average regularization: 5.846309250046033e-06\n",
      "[proc 0][Train] 1 steps take 2.663 seconds\n",
      "[proc 0]sample: 0.025, forward: 0.429, backward: 0.070, update: 2.138\n",
      "[proc 1][Train](227/100000) average pos_loss: 0.803913414478302\n",
      "[proc 1][Train](227/100000) average neg_loss: 0.3134787082672119\n",
      "[proc 1][Train](227/100000) average loss: 0.5586960315704346\n",
      "[proc 1][Train](227/100000) average regularization: 5.745557245973032e-06\n",
      "[proc 1][Train] 1 steps take 2.544 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.042\n",
      "[proc 0][Train](227/100000) average pos_loss: 0.8061639070510864\n",
      "[proc 0][Train](227/100000) average neg_loss: 0.31048911809921265\n",
      "[proc 0][Train](227/100000) average loss: 0.5583264827728271\n",
      "[proc 0][Train](227/100000) average regularization: 5.715181487175869e-06\n",
      "[proc 0][Train] 1 steps take 2.734 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.229\n",
      "[proc 1][Train](228/100000) average pos_loss: 0.7455250024795532\n",
      "[proc 1][Train](228/100000) average neg_loss: 0.5603244304656982\n",
      "[proc 1][Train](228/100000) average loss: 0.6529247164726257\n",
      "[proc 1][Train](228/100000) average regularization: 5.816075372422347e-06\n",
      "[proc 1][Train] 1 steps take 2.620 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.414, backward: 0.070, update: 2.134\n",
      "[proc 0][Train](228/100000) average pos_loss: 0.7728402614593506\n",
      "[proc 0][Train](228/100000) average neg_loss: 0.582993745803833\n",
      "[proc 0][Train](228/100000) average loss: 0.6779170036315918\n",
      "[proc 0][Train](228/100000) average regularization: 5.823503215651726e-06\n",
      "[proc 0][Train] 1 steps take 2.593 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.427, backward: 0.069, update: 2.095\n",
      "[proc 1][Train](229/100000) average pos_loss: 0.7824726104736328\n",
      "[proc 1][Train](229/100000) average neg_loss: 0.3164291977882385\n",
      "[proc 1][Train](229/100000) average loss: 0.5494508743286133\n",
      "[proc 1][Train](229/100000) average regularization: 6.032621058693621e-06\n",
      "[proc 1][Train] 1 steps take 2.588 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.085\n",
      "[proc 0][Train](229/100000) average pos_loss: 0.8141460418701172\n",
      "[proc 0][Train](229/100000) average neg_loss: 0.3247959613800049\n",
      "[proc 0][Train](229/100000) average loss: 0.569471001625061\n",
      "[proc 0][Train](229/100000) average regularization: 5.53749760001665e-06\n",
      "[proc 0][Train] 1 steps take 2.645 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.142\n",
      "[proc 1][Train](230/100000) average pos_loss: 0.775849461555481\n",
      "[proc 1][Train](230/100000) average neg_loss: 0.6150027513504028\n",
      "[proc 1][Train](230/100000) average loss: 0.6954261064529419\n",
      "[proc 1][Train](230/100000) average regularization: 5.713518021366326e-06\n",
      "[proc 1][Train] 1 steps take 2.637 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.134\n",
      "[proc 0][Train](230/100000) average pos_loss: 0.7544500827789307\n",
      "[proc 0][Train](230/100000) average neg_loss: 0.5354130268096924\n",
      "[proc 0][Train](230/100000) average loss: 0.6449315547943115\n",
      "[proc 0][Train](230/100000) average regularization: 5.8322302720625885e-06\n",
      "[proc 0][Train] 1 steps take 2.602 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.409, backward: 0.070, update: 2.121\n",
      "[proc 1][Train](231/100000) average pos_loss: 0.7952370047569275\n",
      "[proc 1][Train](231/100000) average neg_loss: 0.30060282349586487\n",
      "[proc 1][Train](231/100000) average loss: 0.5479199290275574\n",
      "[proc 1][Train](231/100000) average regularization: 5.818742010887945e-06\n",
      "[proc 1][Train] 1 steps take 2.653 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.422, backward: 0.070, update: 2.160\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 0][Train](231/100000) average pos_loss: 0.7872241735458374\n",
      "[proc 0][Train](231/100000) average neg_loss: 0.3200227916240692\n",
      "[proc 0][Train](231/100000) average loss: 0.5536234974861145\n",
      "[proc 0][Train](231/100000) average regularization: 5.793119271402247e-06\n",
      "[proc 0][Train] 1 steps take 2.635 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.129\n",
      "[proc 1][Train](232/100000) average pos_loss: 0.7785314917564392\n",
      "[proc 1][Train](232/100000) average neg_loss: 0.5747613906860352\n",
      "[proc 1][Train](232/100000) average loss: 0.6766464710235596\n",
      "[proc 1][Train](232/100000) average regularization: 5.941182280366775e-06\n",
      "[proc 1][Train] 1 steps take 2.636 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.142\n",
      "[proc 0][Train](232/100000) average pos_loss: 0.7963588237762451\n",
      "[proc 0][Train](232/100000) average neg_loss: 0.6397329568862915\n",
      "[proc 0][Train](232/100000) average loss: 0.7180458903312683\n",
      "[proc 0][Train](232/100000) average regularization: 5.688778855983401e-06\n",
      "[proc 0][Train] 1 steps take 2.600 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.091\n",
      "[proc 1][Train](233/100000) average pos_loss: 0.7838908433914185\n",
      "[proc 1][Train](233/100000) average neg_loss: 0.3109341263771057\n",
      "[proc 1][Train](233/100000) average loss: 0.5474125146865845\n",
      "[proc 1][Train](233/100000) average regularization: 5.822559160151286e-06\n",
      "[proc 1][Train] 1 steps take 2.656 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.071, update: 2.158\n",
      "[proc 0][Train](233/100000) average pos_loss: 0.7759450674057007\n",
      "[proc 0][Train](233/100000) average neg_loss: 0.2997118830680847\n",
      "[proc 0][Train](233/100000) average loss: 0.5378284454345703\n",
      "[proc 0][Train](233/100000) average regularization: 6.041032975190319e-06\n",
      "[proc 0][Train] 1 steps take 2.637 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.131\n",
      "[proc 1][Train](234/100000) average pos_loss: 0.7566536664962769\n",
      "[proc 1][Train](234/100000) average neg_loss: 0.5747054815292358\n",
      "[proc 1][Train](234/100000) average loss: 0.6656795740127563\n",
      "[proc 1][Train](234/100000) average regularization: 6.045325790182687e-06\n",
      "[proc 1][Train] 1 steps take 2.619 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.425, backward: 0.069, update: 2.123\n",
      "[proc 0][Train](234/100000) average pos_loss: 0.7544997930526733\n",
      "[proc 0][Train](234/100000) average neg_loss: 0.5139222145080566\n",
      "[proc 0][Train](234/100000) average loss: 0.634211003780365\n",
      "[proc 0][Train](234/100000) average regularization: 5.941594736214029e-06\n",
      "[proc 0][Train] 1 steps take 2.620 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.113\n",
      "[proc 1][Train](235/100000) average pos_loss: 0.798676073551178\n",
      "[proc 1][Train](235/100000) average neg_loss: 0.32878705859184265\n",
      "[proc 1][Train](235/100000) average loss: 0.5637315511703491\n",
      "[proc 1][Train](235/100000) average regularization: 6.140300229162676e-06\n",
      "[proc 1][Train] 1 steps take 2.797 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.501, backward: 0.070, update: 2.225\n",
      "[proc 0][Train](235/100000) average pos_loss: 0.7849620580673218\n",
      "[proc 0][Train](235/100000) average neg_loss: 0.32902276515960693\n",
      "[proc 0][Train](235/100000) average loss: 0.5569924116134644\n",
      "[proc 0][Train](235/100000) average regularization: 6.024105459800921e-06\n",
      "[proc 0][Train] 1 steps take 2.552 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.400, backward: 0.069, update: 2.081\n",
      "[proc 1][Train](236/100000) average pos_loss: 0.7898989915847778\n",
      "[proc 1][Train](236/100000) average neg_loss: 0.4799840450286865\n",
      "[proc 1][Train](236/100000) average loss: 0.6349415183067322\n",
      "[proc 1][Train](236/100000) average regularization: 5.801953193440568e-06\n",
      "[proc 1][Train] 1 steps take 2.780 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.466, backward: 0.070, update: 2.242\n",
      "[proc 0][Train](236/100000) average pos_loss: 0.7442545890808105\n",
      "[proc 0][Train](236/100000) average neg_loss: 0.6696811318397522\n",
      "[proc 0][Train](236/100000) average loss: 0.706967830657959\n",
      "[proc 0][Train](236/100000) average regularization: 5.9926073845417704e-06\n",
      "[proc 0][Train] 1 steps take 2.701 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.433, backward: 0.069, update: 2.197\n",
      "[proc 1][Train](237/100000) average pos_loss: 0.8105971217155457\n",
      "[proc 1][Train](237/100000) average neg_loss: 0.2878909409046173\n",
      "[proc 1][Train](237/100000) average loss: 0.5492440462112427\n",
      "[proc 1][Train](237/100000) average regularization: 5.712131041946122e-06\n",
      "[proc 1][Train] 1 steps take 2.804 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.479, backward: 0.070, update: 2.253\n",
      "[proc 0][Train](237/100000) average pos_loss: 0.7783606052398682\n",
      "[proc 0][Train](237/100000) average neg_loss: 0.2913113832473755\n",
      "[proc 0][Train](237/100000) average loss: 0.5348359942436218\n",
      "[proc 0][Train](237/100000) average regularization: 5.9887743191211484e-06\n",
      "[proc 0][Train] 1 steps take 2.803 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.475, backward: 0.069, update: 2.257\n",
      "[proc 1][Train](238/100000) average pos_loss: 0.6965693235397339\n",
      "[proc 1][Train](238/100000) average neg_loss: 0.5760266780853271\n",
      "[proc 1][Train](238/100000) average loss: 0.6362980008125305\n",
      "[proc 1][Train](238/100000) average regularization: 5.990343197481707e-06\n",
      "[proc 1][Train] 1 steps take 2.804 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.480, backward: 0.070, update: 2.251\n",
      "[proc 0][Train](238/100000) average pos_loss: 0.7378268241882324\n",
      "[proc 0][Train](238/100000) average neg_loss: 0.5979932546615601\n",
      "[proc 0][Train](238/100000) average loss: 0.6679100394248962\n",
      "[proc 0][Train](238/100000) average regularization: 5.758117822551867e-06\n",
      "[proc 0][Train] 1 steps take 2.755 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.472, backward: 0.070, update: 2.211\n",
      "[proc 1][Train](239/100000) average pos_loss: 0.7939597964286804\n",
      "[proc 1][Train](239/100000) average neg_loss: 0.3040544092655182\n",
      "[proc 1][Train](239/100000) average loss: 0.5490071177482605\n",
      "[proc 1][Train](239/100000) average regularization: 5.9311332734068856e-06\n",
      "[proc 1][Train] 1 steps take 2.827 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.481, backward: 0.069, update: 2.275\n",
      "[proc 0][Train](239/100000) average pos_loss: 0.8010693192481995\n",
      "[proc 0][Train](239/100000) average neg_loss: 0.3290400803089142\n",
      "[proc 0][Train](239/100000) average loss: 0.565054714679718\n",
      "[proc 0][Train](239/100000) average regularization: 5.8074615481018554e-06\n",
      "[proc 0][Train] 1 steps take 2.738 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.234\n",
      "[proc 0][Train](240/100000) average pos_loss: 0.7073534727096558\n",
      "[proc 0][Train](240/100000) average neg_loss: 0.5491023063659668\n",
      "[proc 0][Train](240/100000) average loss: 0.6282278895378113\n",
      "[proc 0][Train](240/100000) average regularization: 5.795687229692703e-06\n",
      "[proc 0][Train] 1 steps take 2.757 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.474, backward: 0.069, update: 2.212\n",
      "[proc 1][Train](240/100000) average pos_loss: 0.7508074641227722\n",
      "[proc 1][Train](240/100000) average neg_loss: 0.619691789150238\n",
      "[proc 1][Train](240/100000) average loss: 0.6852496266365051\n",
      "[proc 1][Train](240/100000) average regularization: 5.843150574946776e-06\n",
      "[proc 1][Train] 1 steps take 3.186 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.491, backward: 0.070, update: 2.623\n",
      "[proc 0][Train](241/100000) average pos_loss: 0.8329527378082275\n",
      "[proc 0][Train](241/100000) average neg_loss: 0.3144194781780243\n",
      "[proc 0][Train](241/100000) average loss: 0.5736861228942871\n",
      "[proc 0][Train](241/100000) average regularization: 5.9202711781836115e-06\n",
      "[proc 0][Train] 1 steps take 2.686 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.479, backward: 0.069, update: 2.122\n",
      "[proc 1][Train](241/100000) average pos_loss: 0.7810678482055664\n",
      "[proc 1][Train](241/100000) average neg_loss: 0.30759167671203613\n",
      "[proc 1][Train](241/100000) average loss: 0.5443297624588013\n",
      "[proc 1][Train](241/100000) average regularization: 6.084264441597043e-06\n",
      "[proc 1][Train] 1 steps take 2.808 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.438, backward: 0.073, update: 2.281\n",
      "[proc 0][Train](242/100000) average pos_loss: 0.7278235554695129\n",
      "[proc 0][Train](242/100000) average neg_loss: 0.5834206938743591\n",
      "[proc 0][Train](242/100000) average loss: 0.655622124671936\n",
      "[proc 0][Train](242/100000) average regularization: 5.955559572612401e-06\n",
      "[proc 0][Train] 1 steps take 2.592 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.432, backward: 0.071, update: 2.073\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](242/100000) average pos_loss: 0.7411348819732666\n",
      "[proc 1][Train](242/100000) average neg_loss: 0.5570770502090454\n",
      "[proc 1][Train](242/100000) average loss: 0.649105966091156\n",
      "[proc 1][Train](242/100000) average regularization: 5.980419700790662e-06\n",
      "[proc 1][Train] 1 steps take 2.659 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.436, backward: 0.072, update: 2.133\n",
      "[proc 0][Train](243/100000) average pos_loss: 0.7649781703948975\n",
      "[proc 0][Train](243/100000) average neg_loss: 0.29810529947280884\n",
      "[proc 0][Train](243/100000) average loss: 0.5315417051315308\n",
      "[proc 0][Train](243/100000) average regularization: 6.049957846698817e-06\n",
      "[proc 0][Train] 1 steps take 2.547 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.425, backward: 0.070, update: 2.051\n",
      "[proc 1][Train](243/100000) average pos_loss: 0.8019920587539673\n",
      "[proc 1][Train](243/100000) average neg_loss: 0.3242569863796234\n",
      "[proc 1][Train](243/100000) average loss: 0.5631245374679565\n",
      "[proc 1][Train](243/100000) average regularization: 6.170165306684794e-06\n",
      "[proc 1][Train] 1 steps take 2.718 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.214\n",
      "[proc 0][Train](244/100000) average pos_loss: 0.7724372148513794\n",
      "[proc 0][Train](244/100000) average neg_loss: 0.5339362621307373\n",
      "[proc 0][Train](244/100000) average loss: 0.6531867384910583\n",
      "[proc 0][Train](244/100000) average regularization: 5.935996796324616e-06\n",
      "[proc 0][Train] 1 steps take 2.575 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.419, backward: 0.070, update: 2.084\n",
      "[proc 1][Train](244/100000) average pos_loss: 0.7674298882484436\n",
      "[proc 1][Train](244/100000) average neg_loss: 0.5990760922431946\n",
      "[proc 1][Train](244/100000) average loss: 0.6832529902458191\n",
      "[proc 1][Train](244/100000) average regularization: 5.998922915750882e-06\n",
      "[proc 1][Train] 1 steps take 2.617 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.112\n",
      "[proc 0][Train](245/100000) average pos_loss: 0.7977616786956787\n",
      "[proc 0][Train](245/100000) average neg_loss: 0.3315800428390503\n",
      "[proc 0][Train](245/100000) average loss: 0.5646708607673645\n",
      "[proc 0][Train](245/100000) average regularization: 6.0810002651123796e-06\n",
      "[proc 0][Train] 1 steps take 2.613 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.069, update: 2.115\n",
      "[proc 1][Train](245/100000) average pos_loss: 0.7418158054351807\n",
      "[proc 1][Train](245/100000) average neg_loss: 0.30877479910850525\n",
      "[proc 1][Train](245/100000) average loss: 0.5252953171730042\n",
      "[proc 1][Train](245/100000) average regularization: 6.126011612650473e-06\n",
      "[proc 1][Train] 1 steps take 2.635 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.411, backward: 0.069, update: 2.154\n",
      "[proc 0][Train](246/100000) average pos_loss: 0.732225775718689\n",
      "[proc 0][Train](246/100000) average neg_loss: 0.5757708549499512\n",
      "[proc 0][Train](246/100000) average loss: 0.6539983153343201\n",
      "[proc 0][Train](246/100000) average regularization: 6.187034614413278e-06\n",
      "[proc 0][Train] 1 steps take 2.643 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.137\n",
      "[proc 1][Train](246/100000) average pos_loss: 0.7539808750152588\n",
      "[proc 1][Train](246/100000) average neg_loss: 0.5563933849334717\n",
      "[proc 1][Train](246/100000) average loss: 0.6551871299743652\n",
      "[proc 1][Train](246/100000) average regularization: 6.0091410887253005e-06\n",
      "[proc 1][Train] 1 steps take 2.694 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.411, backward: 0.070, update: 2.211\n",
      "[proc 0][Train](247/100000) average pos_loss: 0.7914159893989563\n",
      "[proc 0][Train](247/100000) average neg_loss: 0.3174252510070801\n",
      "[proc 0][Train](247/100000) average loss: 0.5544205904006958\n",
      "[proc 0][Train](247/100000) average regularization: 6.173334440973122e-06\n",
      "[proc 0][Train] 1 steps take 2.700 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.192\n",
      "[proc 1][Train](247/100000) average pos_loss: 0.7741713523864746\n",
      "[proc 1][Train](247/100000) average neg_loss: 0.3421335220336914\n",
      "[proc 1][Train](247/100000) average loss: 0.558152437210083\n",
      "[proc 1][Train](247/100000) average regularization: 6.026799837854924e-06\n",
      "[proc 1][Train] 1 steps take 2.668 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.431, backward: 0.070, update: 2.164\n",
      "[proc 0][Train](248/100000) average pos_loss: 0.7993669509887695\n",
      "[proc 0][Train](248/100000) average neg_loss: 0.5556543469429016\n",
      "[proc 0][Train](248/100000) average loss: 0.6775106191635132\n",
      "[proc 0][Train](248/100000) average regularization: 5.9539916037465446e-06\n",
      "[proc 0][Train] 1 steps take 2.623 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.446, backward: 0.069, update: 2.106\n",
      "[proc 1][Train](248/100000) average pos_loss: 0.7115781307220459\n",
      "[proc 1][Train](248/100000) average neg_loss: 0.5450140833854675\n",
      "[proc 1][Train](248/100000) average loss: 0.6282961368560791\n",
      "[proc 1][Train](248/100000) average regularization: 6.100761765992502e-06\n",
      "[proc 1][Train] 1 steps take 2.607 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.107\n",
      "[proc 0][Train](249/100000) average pos_loss: 0.7670702338218689\n",
      "[proc 0][Train](249/100000) average neg_loss: 0.312280535697937\n",
      "[proc 0][Train](249/100000) average loss: 0.5396753549575806\n",
      "[proc 0][Train](249/100000) average regularization: 6.206637863215292e-06\n",
      "[proc 0][Train] 1 steps take 2.722 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.219\n",
      "[proc 1][Train](249/100000) average pos_loss: 0.7691465616226196\n",
      "[proc 1][Train](249/100000) average neg_loss: 0.27198559045791626\n",
      "[proc 1][Train](249/100000) average loss: 0.5205661058425903\n",
      "[proc 1][Train](249/100000) average regularization: 6.138527623988921e-06\n",
      "[proc 1][Train] 1 steps take 2.613 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.403, backward: 0.071, update: 2.138\n",
      "[proc 0][Train](250/100000) average pos_loss: 0.7084562182426453\n",
      "[proc 0][Train](250/100000) average neg_loss: 0.6128490567207336\n",
      "[proc 0][Train](250/100000) average loss: 0.6606526374816895\n",
      "[proc 0][Train](250/100000) average regularization: 6.003791895636823e-06\n",
      "[proc 0][Train] 1 steps take 2.627 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.126\n",
      "[proc 1][Train](250/100000) average pos_loss: 0.7413188219070435\n",
      "[proc 1][Train](250/100000) average neg_loss: 0.6140384674072266\n",
      "[proc 1][Train](250/100000) average loss: 0.677678644657135\n",
      "[proc 1][Train](250/100000) average regularization: 6.034149464539951e-06\n",
      "[proc 1][Train] 1 steps take 2.704 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.422, backward: 0.070, update: 2.211\n",
      "[proc 0][Train](251/100000) average pos_loss: 0.7486212253570557\n",
      "[proc 0][Train](251/100000) average neg_loss: 0.3020724952220917\n",
      "[proc 0][Train](251/100000) average loss: 0.5253468751907349\n",
      "[proc 0][Train](251/100000) average regularization: 6.311267043201951e-06\n",
      "[proc 0][Train] 1 steps take 2.672 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.443, backward: 0.070, update: 2.157\n",
      "[proc 1][Train](251/100000) average pos_loss: 0.7456027269363403\n",
      "[proc 1][Train](251/100000) average neg_loss: 0.2931658923625946\n",
      "[proc 1][Train](251/100000) average loss: 0.5193843245506287\n",
      "[proc 1][Train](251/100000) average regularization: 6.012040557834553e-06\n",
      "[proc 1][Train] 1 steps take 2.624 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.407, backward: 0.070, update: 2.145\n",
      "[proc 0][Train](252/100000) average pos_loss: 0.7415997385978699\n",
      "[proc 0][Train](252/100000) average neg_loss: 0.540468692779541\n",
      "[proc 0][Train](252/100000) average loss: 0.6410342454910278\n",
      "[proc 0][Train](252/100000) average regularization: 6.119037152529927e-06\n",
      "[proc 0][Train] 1 steps take 2.616 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.111\n",
      "[proc 1][Train](252/100000) average pos_loss: 0.7368814945220947\n",
      "[proc 1][Train](252/100000) average neg_loss: 0.5592201948165894\n",
      "[proc 1][Train](252/100000) average loss: 0.648050844669342\n",
      "[proc 1][Train](252/100000) average regularization: 6.0308138927211985e-06\n",
      "[proc 1][Train] 1 steps take 2.612 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.402, backward: 0.069, update: 2.138\n",
      "[proc 0][Train](253/100000) average pos_loss: 0.7265322804450989\n",
      "[proc 0][Train](253/100000) average neg_loss: 0.2966994643211365\n",
      "[proc 0][Train](253/100000) average loss: 0.5116158723831177\n",
      "[proc 0][Train](253/100000) average regularization: 6.2620970311400015e-06\n",
      "[proc 0][Train] 1 steps take 2.694 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.463, backward: 0.069, update: 2.160\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](253/100000) average pos_loss: 0.6882216930389404\n",
      "[proc 1][Train](253/100000) average neg_loss: 0.3222900927066803\n",
      "[proc 1][Train](253/100000) average loss: 0.5052558779716492\n",
      "[proc 1][Train](253/100000) average regularization: 6.329639745672466e-06\n",
      "[proc 1][Train] 1 steps take 2.684 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.071, update: 2.180\n",
      "[proc 0][Train](254/100000) average pos_loss: 0.7438844442367554\n",
      "[proc 0][Train](254/100000) average neg_loss: 0.5476478934288025\n",
      "[proc 0][Train](254/100000) average loss: 0.6457661390304565\n",
      "[proc 0][Train](254/100000) average regularization: 6.0983634284639265e-06\n",
      "[proc 0][Train] 1 steps take 2.635 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.130\n",
      "[proc 1][Train](254/100000) average pos_loss: 0.7857645750045776\n",
      "[proc 1][Train](254/100000) average neg_loss: 0.5963612794876099\n",
      "[proc 1][Train](254/100000) average loss: 0.6910629272460938\n",
      "[proc 1][Train](254/100000) average regularization: 5.865484581590863e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.162\n",
      "[proc 0][Train](255/100000) average pos_loss: 0.78753262758255\n",
      "[proc 0][Train](255/100000) average neg_loss: 0.2931334972381592\n",
      "[proc 0][Train](255/100000) average loss: 0.5403330326080322\n",
      "[proc 0][Train](255/100000) average regularization: 6.383948857546784e-06\n",
      "[proc 0][Train] 1 steps take 2.589 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.085\n",
      "[proc 1][Train](255/100000) average pos_loss: 0.7390681505203247\n",
      "[proc 1][Train](255/100000) average neg_loss: 0.3182832598686218\n",
      "[proc 1][Train](255/100000) average loss: 0.5286756753921509\n",
      "[proc 1][Train](255/100000) average regularization: 6.2058784351393115e-06\n",
      "[proc 1][Train] 1 steps take 2.695 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.193\n",
      "[proc 0][Train](256/100000) average pos_loss: 0.7258205413818359\n",
      "[proc 0][Train](256/100000) average neg_loss: 0.6258020997047424\n",
      "[proc 0][Train](256/100000) average loss: 0.6758112907409668\n",
      "[proc 0][Train](256/100000) average regularization: 6.085842414904619e-06\n",
      "[proc 0][Train] 1 steps take 2.643 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.415, backward: 0.070, update: 2.157\n",
      "[proc 1][Train](256/100000) average pos_loss: 0.7576065063476562\n",
      "[proc 1][Train](256/100000) average neg_loss: 0.5708286762237549\n",
      "[proc 1][Train](256/100000) average loss: 0.6642175912857056\n",
      "[proc 1][Train](256/100000) average regularization: 6.343201675917953e-06\n",
      "[proc 1][Train] 1 steps take 2.682 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.408, backward: 0.070, update: 2.202\n",
      "[proc 0][Train](257/100000) average pos_loss: 0.7136205434799194\n",
      "[proc 0][Train](257/100000) average neg_loss: 0.3173655867576599\n",
      "[proc 0][Train](257/100000) average loss: 0.5154930353164673\n",
      "[proc 0][Train](257/100000) average regularization: 6.320359261735575e-06\n",
      "[proc 0][Train] 1 steps take 2.713 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.458, backward: 0.070, update: 2.169\n",
      "[proc 1][Train](257/100000) average pos_loss: 0.7403525710105896\n",
      "[proc 1][Train](257/100000) average neg_loss: 0.3154330253601074\n",
      "[proc 1][Train](257/100000) average loss: 0.5278928279876709\n",
      "[proc 1][Train](257/100000) average regularization: 6.137064247013768e-06\n",
      "[proc 1][Train] 1 steps take 2.804 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.427, backward: 0.069, update: 2.290\n",
      "[proc 0][Train](258/100000) average pos_loss: 0.7187238931655884\n",
      "[proc 0][Train](258/100000) average neg_loss: 0.6205499172210693\n",
      "[proc 0][Train](258/100000) average loss: 0.6696369051933289\n",
      "[proc 0][Train](258/100000) average regularization: 6.26698511041468e-06\n",
      "[proc 0][Train] 1 steps take 2.742 seconds\n",
      "[proc 0]sample: 0.020, forward: 0.439, backward: 0.069, update: 2.213\n",
      "[proc 1][Train](258/100000) average pos_loss: 0.7510755062103271\n",
      "[proc 1][Train](258/100000) average neg_loss: 0.5705677270889282\n",
      "[proc 1][Train](258/100000) average loss: 0.6608216166496277\n",
      "[proc 1][Train](258/100000) average regularization: 6.250783826544648e-06\n",
      "[proc 1][Train] 1 steps take 2.706 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.422, backward: 0.070, update: 2.196\n",
      "[proc 0][Train](259/100000) average pos_loss: 0.7571423053741455\n",
      "[proc 0][Train](259/100000) average neg_loss: 0.29249244928359985\n",
      "[proc 0][Train](259/100000) average loss: 0.5248173475265503\n",
      "[proc 0][Train](259/100000) average regularization: 6.356451194733381e-06\n",
      "[proc 0][Train] 1 steps take 2.699 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.199\n",
      "[proc 1][Train](259/100000) average pos_loss: 0.7590689659118652\n",
      "[proc 1][Train](259/100000) average neg_loss: 0.30762457847595215\n",
      "[proc 1][Train](259/100000) average loss: 0.5333467721939087\n",
      "[proc 1][Train](259/100000) average regularization: 6.063039108994417e-06\n",
      "[proc 1][Train] 1 steps take 2.700 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.489, backward: 0.070, update: 2.139\n",
      "[proc 0][Train](260/100000) average pos_loss: 0.7595798969268799\n",
      "[proc 0][Train](260/100000) average neg_loss: 0.5614526867866516\n",
      "[proc 0][Train](260/100000) average loss: 0.6605162620544434\n",
      "[proc 0][Train](260/100000) average regularization: 6.321558430499863e-06\n",
      "[proc 0][Train] 1 steps take 2.733 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.228\n",
      "[proc 1][Train](260/100000) average pos_loss: 0.7648177146911621\n",
      "[proc 1][Train](260/100000) average neg_loss: 0.5501664280891418\n",
      "[proc 1][Train](260/100000) average loss: 0.6574920415878296\n",
      "[proc 1][Train](260/100000) average regularization: 6.420253157557454e-06\n",
      "[proc 1][Train] 1 steps take 2.827 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.436, backward: 0.071, update: 2.318\n",
      "[proc 0][Train](261/100000) average pos_loss: 0.7463226318359375\n",
      "[proc 0][Train](261/100000) average neg_loss: 0.307236909866333\n",
      "[proc 0][Train](261/100000) average loss: 0.5267797708511353\n",
      "[proc 0][Train](261/100000) average regularization: 6.522882358694915e-06\n",
      "[proc 0][Train] 1 steps take 2.780 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.270\n",
      "[proc 1][Train](261/100000) average pos_loss: 0.7335810661315918\n",
      "[proc 1][Train](261/100000) average neg_loss: 0.3353866934776306\n",
      "[proc 1][Train](261/100000) average loss: 0.5344839096069336\n",
      "[proc 1][Train](261/100000) average regularization: 6.178994226502255e-06\n",
      "[proc 1][Train] 1 steps take 2.734 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.227\n",
      "[proc 0][Train](262/100000) average pos_loss: 0.6985883712768555\n",
      "[proc 0][Train](262/100000) average neg_loss: 0.6101190447807312\n",
      "[proc 0][Train](262/100000) average loss: 0.6543537378311157\n",
      "[proc 0][Train](262/100000) average regularization: 6.278330147324596e-06\n",
      "[proc 0][Train] 1 steps take 2.734 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.443, backward: 0.070, update: 2.219\n",
      "[proc 1][Train](262/100000) average pos_loss: 0.73863685131073\n",
      "[proc 1][Train](262/100000) average neg_loss: 0.5292952656745911\n",
      "[proc 1][Train](262/100000) average loss: 0.6339660882949829\n",
      "[proc 1][Train](262/100000) average regularization: 6.328501967800548e-06\n",
      "[proc 1][Train] 1 steps take 2.809 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.412, backward: 0.069, update: 2.326\n",
      "[proc 0][Train](263/100000) average pos_loss: 0.780224621295929\n",
      "[proc 0][Train](263/100000) average neg_loss: 0.308176726102829\n",
      "[proc 0][Train](263/100000) average loss: 0.5442006587982178\n",
      "[proc 0][Train](263/100000) average regularization: 6.399395715561695e-06\n",
      "[proc 0][Train] 1 steps take 2.670 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.403, backward: 0.070, update: 2.196\n",
      "[proc 1][Train](263/100000) average pos_loss: 0.7382413148880005\n",
      "[proc 1][Train](263/100000) average neg_loss: 0.3269777297973633\n",
      "[proc 1][Train](263/100000) average loss: 0.5326095223426819\n",
      "[proc 1][Train](263/100000) average regularization: 6.604314421565505e-06\n",
      "[proc 1][Train] 1 steps take 2.608 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.438, backward: 0.069, update: 2.100\n",
      "[proc 0][Train](264/100000) average pos_loss: 0.7222611308097839\n",
      "[proc 0][Train](264/100000) average neg_loss: 0.5507327318191528\n",
      "[proc 0][Train](264/100000) average loss: 0.636496901512146\n",
      "[proc 0][Train](264/100000) average regularization: 6.092549028835492e-06\n",
      "[proc 0][Train] 1 steps take 2.716 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.441, backward: 0.070, update: 2.203\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](264/100000) average pos_loss: 0.7385340332984924\n",
      "[proc 1][Train](264/100000) average neg_loss: 0.5141216516494751\n",
      "[proc 1][Train](264/100000) average loss: 0.6263278722763062\n",
      "[proc 1][Train](264/100000) average regularization: 6.313614903774578e-06\n",
      "[proc 1][Train] 1 steps take 2.705 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.211\n",
      "[proc 0][Train](265/100000) average pos_loss: 0.7220503091812134\n",
      "[proc 0][Train](265/100000) average neg_loss: 0.28344234824180603\n",
      "[proc 0][Train](265/100000) average loss: 0.5027463436126709\n",
      "[proc 0][Train](265/100000) average regularization: 6.424694220186211e-06\n",
      "[proc 0][Train] 1 steps take 2.689 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.444, backward: 0.070, update: 2.173\n",
      "[proc 1][Train](265/100000) average pos_loss: 0.6900089979171753\n",
      "[proc 1][Train](265/100000) average neg_loss: 0.3384358286857605\n",
      "[proc 1][Train](265/100000) average loss: 0.5142223834991455\n",
      "[proc 1][Train](265/100000) average regularization: 6.285968083830085e-06\n",
      "[proc 1][Train] 1 steps take 2.768 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.434, backward: 0.069, update: 2.263\n",
      "[proc 0][Train](266/100000) average pos_loss: 0.7057133316993713\n",
      "[proc 0][Train](266/100000) average neg_loss: 0.5663574934005737\n",
      "[proc 0][Train](266/100000) average loss: 0.6360354423522949\n",
      "[proc 0][Train](266/100000) average regularization: 6.304344424279407e-06\n",
      "[proc 0][Train] 1 steps take 2.803 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.446, backward: 0.069, update: 2.286\n",
      "[proc 1][Train](266/100000) average pos_loss: 0.7273827791213989\n",
      "[proc 1][Train](266/100000) average neg_loss: 0.5877630710601807\n",
      "[proc 1][Train](266/100000) average loss: 0.6575729250907898\n",
      "[proc 1][Train](266/100000) average regularization: 6.307322109933011e-06\n",
      "[proc 1][Train] 1 steps take 2.823 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.419, backward: 0.069, update: 2.334\n",
      "[proc 0][Train](267/100000) average pos_loss: 0.7657589912414551\n",
      "[proc 0][Train](267/100000) average neg_loss: 0.29213303327560425\n",
      "[proc 0][Train](267/100000) average loss: 0.528946042060852\n",
      "[proc 0][Train](267/100000) average regularization: 6.428394954127725e-06\n",
      "[proc 0][Train] 1 steps take 2.811 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.444, backward: 0.069, update: 2.297\n",
      "[proc 1][Train](267/100000) average pos_loss: 0.7030478119850159\n",
      "[proc 1][Train](267/100000) average neg_loss: 0.3261989951133728\n",
      "[proc 1][Train](267/100000) average loss: 0.5146234035491943\n",
      "[proc 1][Train](267/100000) average regularization: 6.352579021040583e-06\n",
      "[proc 1][Train] 1 steps take 2.768 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.273\n",
      "[proc 0][Train](268/100000) average pos_loss: 0.7458993792533875\n",
      "[proc 0][Train](268/100000) average neg_loss: 0.6039299964904785\n",
      "[proc 0][Train](268/100000) average loss: 0.6749147176742554\n",
      "[proc 0][Train](268/100000) average regularization: 6.205650606716517e-06\n",
      "[proc 0][Train] 1 steps take 2.752 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.447, backward: 0.070, update: 2.234\n",
      "[proc 1][Train](268/100000) average pos_loss: 0.6994509696960449\n",
      "[proc 1][Train](268/100000) average neg_loss: 0.5219074487686157\n",
      "[proc 1][Train](268/100000) average loss: 0.6106792092323303\n",
      "[proc 1][Train](268/100000) average regularization: 6.28948464509449e-06\n",
      "[proc 1][Train] 1 steps take 2.765 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.271\n",
      "[proc 0][Train](269/100000) average pos_loss: 0.773484468460083\n",
      "[proc 0][Train](269/100000) average neg_loss: 0.3012862801551819\n",
      "[proc 0][Train](269/100000) average loss: 0.5373853445053101\n",
      "[proc 0][Train](269/100000) average regularization: 6.112918072176399e-06\n",
      "[proc 0][Train] 1 steps take 2.753 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.456, backward: 0.070, update: 2.226\n",
      "[proc 1][Train](269/100000) average pos_loss: 0.7360628247261047\n",
      "[proc 1][Train](269/100000) average neg_loss: 0.3225182294845581\n",
      "[proc 1][Train](269/100000) average loss: 0.5292905569076538\n",
      "[proc 1][Train](269/100000) average regularization: 6.197947186592501e-06\n",
      "[proc 1][Train] 1 steps take 2.723 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.223\n",
      "[proc 0][Train](270/100000) average pos_loss: 0.7006725072860718\n",
      "[proc 0][Train](270/100000) average neg_loss: 0.5521641373634338\n",
      "[proc 0][Train](270/100000) average loss: 0.6264183521270752\n",
      "[proc 0][Train](270/100000) average regularization: 6.2393705775320996e-06\n",
      "[proc 0][Train] 1 steps take 2.688 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.430, backward: 0.070, update: 2.186\n",
      "[proc 1][Train](270/100000) average pos_loss: 0.739831805229187\n",
      "[proc 1][Train](270/100000) average neg_loss: 0.5672817826271057\n",
      "[proc 1][Train](270/100000) average loss: 0.6535568237304688\n",
      "[proc 1][Train](270/100000) average regularization: 6.428430424421094e-06\n",
      "[proc 1][Train] 1 steps take 2.579 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.422, backward: 0.069, update: 2.086\n",
      "[proc 0][Train](271/100000) average pos_loss: 0.7555539608001709\n",
      "[proc 0][Train](271/100000) average neg_loss: 0.2837837040424347\n",
      "[proc 0][Train](271/100000) average loss: 0.5196688175201416\n",
      "[proc 0][Train](271/100000) average regularization: 6.420688805519603e-06\n",
      "[proc 0][Train] 1 steps take 2.763 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.449, backward: 0.070, update: 2.242\n",
      "[proc 1][Train](271/100000) average pos_loss: 0.7091025710105896\n",
      "[proc 1][Train](271/100000) average neg_loss: 0.311082124710083\n",
      "[proc 1][Train](271/100000) average loss: 0.5100923776626587\n",
      "[proc 1][Train](271/100000) average regularization: 6.306759587459965e-06\n",
      "[proc 1][Train] 1 steps take 2.653 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.153\n",
      "[proc 0][Train](272/100000) average pos_loss: 0.7518834471702576\n",
      "[proc 0][Train](272/100000) average neg_loss: 0.5744518637657166\n",
      "[proc 0][Train](272/100000) average loss: 0.6631676554679871\n",
      "[proc 0][Train](272/100000) average regularization: 6.469329946412472e-06\n",
      "[proc 0][Train] 1 steps take 2.628 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.415, backward: 0.070, update: 2.143\n",
      "[proc 1][Train](272/100000) average pos_loss: 0.7206950187683105\n",
      "[proc 1][Train](272/100000) average neg_loss: 0.5677417516708374\n",
      "[proc 1][Train](272/100000) average loss: 0.644218385219574\n",
      "[proc 1][Train](272/100000) average regularization: 6.416371434170287e-06\n",
      "[proc 1][Train] 1 steps take 2.735 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.232\n",
      "[proc 0][Train](273/100000) average pos_loss: 0.7231329679489136\n",
      "[proc 0][Train](273/100000) average neg_loss: 0.3406814634799957\n",
      "[proc 0][Train](273/100000) average loss: 0.5319072008132935\n",
      "[proc 0][Train](273/100000) average regularization: 6.498737548099598e-06\n",
      "[proc 0][Train] 1 steps take 2.737 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.435, backward: 0.070, update: 2.215\n",
      "[proc 1][Train](273/100000) average pos_loss: 0.6954741477966309\n",
      "[proc 1][Train](273/100000) average neg_loss: 0.307364284992218\n",
      "[proc 1][Train](273/100000) average loss: 0.501419186592102\n",
      "[proc 1][Train](273/100000) average regularization: 6.458754796767607e-06\n",
      "[proc 1][Train] 1 steps take 2.833 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.429, backward: 0.070, update: 2.317\n",
      "[proc 0][Train](274/100000) average pos_loss: 0.6790857315063477\n",
      "[proc 0][Train](274/100000) average neg_loss: 0.5681837797164917\n",
      "[proc 0][Train](274/100000) average loss: 0.6236347556114197\n",
      "[proc 0][Train](274/100000) average regularization: 6.59589386486914e-06\n",
      "[proc 0][Train] 1 steps take 2.767 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.435, backward: 0.070, update: 2.244\n",
      "[proc 1][Train](274/100000) average pos_loss: 0.7155433893203735\n",
      "[proc 1][Train](274/100000) average neg_loss: 0.5437649488449097\n",
      "[proc 1][Train](274/100000) average loss: 0.6296541690826416\n",
      "[proc 1][Train](274/100000) average regularization: 6.2940835050540045e-06\n",
      "[proc 1][Train] 1 steps take 2.747 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.435, backward: 0.070, update: 2.227\n",
      "[proc 0][Train](275/100000) average pos_loss: 0.7567789554595947\n",
      "[proc 0][Train](275/100000) average neg_loss: 0.30531036853790283\n",
      "[proc 0][Train](275/100000) average loss: 0.5310446619987488\n",
      "[proc 0][Train](275/100000) average regularization: 6.3471338762610685e-06\n",
      "[proc 0][Train] 1 steps take 2.771 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.446, backward: 0.069, update: 2.255\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](275/100000) average pos_loss: 0.6881613731384277\n",
      "[proc 1][Train](275/100000) average neg_loss: 0.3154081106185913\n",
      "[proc 1][Train](275/100000) average loss: 0.5017847418785095\n",
      "[proc 1][Train](275/100000) average regularization: 6.25842312729219e-06\n",
      "[proc 1][Train] 1 steps take 2.711 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.213\n",
      "[proc 0][Train](276/100000) average pos_loss: 0.7072272896766663\n",
      "[proc 0][Train](276/100000) average neg_loss: 0.533738374710083\n",
      "[proc 0][Train](276/100000) average loss: 0.6204828023910522\n",
      "[proc 0][Train](276/100000) average regularization: 6.695125648548128e-06\n",
      "[proc 0][Train] 1 steps take 2.737 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.447, backward: 0.069, update: 2.219\n",
      "[proc 1][Train](276/100000) average pos_loss: 0.7142603397369385\n",
      "[proc 1][Train](276/100000) average neg_loss: 0.5543946623802185\n",
      "[proc 1][Train](276/100000) average loss: 0.6343275308609009\n",
      "[proc 1][Train](276/100000) average regularization: 6.375906650646357e-06\n",
      "[proc 1][Train] 1 steps take 2.767 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.447, backward: 0.070, update: 2.249\n",
      "[proc 0][Train](277/100000) average pos_loss: 0.7434340119361877\n",
      "[proc 0][Train](277/100000) average neg_loss: 0.2946818470954895\n",
      "[proc 0][Train](277/100000) average loss: 0.5190579295158386\n",
      "[proc 0][Train](277/100000) average regularization: 6.251427748793503e-06\n",
      "[proc 0][Train] 1 steps take 2.692 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.187\n",
      "[proc 1][Train](277/100000) average pos_loss: 0.7157315015792847\n",
      "[proc 1][Train](277/100000) average neg_loss: 0.3129656910896301\n",
      "[proc 1][Train](277/100000) average loss: 0.5143486261367798\n",
      "[proc 1][Train](277/100000) average regularization: 6.572848633368267e-06\n",
      "[proc 1][Train] 1 steps take 2.673 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.419, backward: 0.069, update: 2.183\n",
      "[proc 0][Train](278/100000) average pos_loss: 0.6922653913497925\n",
      "[proc 0][Train](278/100000) average neg_loss: 0.6044538021087646\n",
      "[proc 0][Train](278/100000) average loss: 0.6483595967292786\n",
      "[proc 0][Train](278/100000) average regularization: 6.668103196716402e-06\n",
      "[proc 0][Train] 1 steps take 2.625 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.069, update: 2.122\n",
      "[proc 1][Train](278/100000) average pos_loss: 0.7347855567932129\n",
      "[proc 1][Train](278/100000) average neg_loss: 0.545021116733551\n",
      "[proc 1][Train](278/100000) average loss: 0.6399033069610596\n",
      "[proc 1][Train](278/100000) average regularization: 6.513465905300109e-06\n",
      "[proc 1][Train] 1 steps take 2.773 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.279\n",
      "[proc 0][Train](279/100000) average pos_loss: 0.7006848454475403\n",
      "[proc 0][Train](279/100000) average neg_loss: 0.29648658633232117\n",
      "[proc 0][Train](279/100000) average loss: 0.49858570098876953\n",
      "[proc 0][Train](279/100000) average regularization: 6.5197350522794295e-06\n",
      "[proc 0][Train] 1 steps take 3.036 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.452, backward: 0.069, update: 2.514\n",
      "[proc 1][Train](279/100000) average pos_loss: 0.6969178318977356\n",
      "[proc 1][Train](279/100000) average neg_loss: 0.32793110609054565\n",
      "[proc 1][Train](279/100000) average loss: 0.5124244689941406\n",
      "[proc 1][Train](279/100000) average regularization: 6.460829354182351e-06\n",
      "[proc 1][Train] 1 steps take 2.642 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.405, backward: 0.069, update: 2.167\n",
      "[proc 0][Train](280/100000) average pos_loss: 0.7253017425537109\n",
      "[proc 0][Train](280/100000) average neg_loss: 0.5545194745063782\n",
      "[proc 0][Train](280/100000) average loss: 0.6399105787277222\n",
      "[proc 0][Train](280/100000) average regularization: 6.281741207203595e-06\n",
      "[proc 0][Train] 1 steps take 2.588 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.090\n",
      "[proc 1][Train](280/100000) average pos_loss: 0.693816065788269\n",
      "[proc 1][Train](280/100000) average neg_loss: 0.5779414176940918\n",
      "[proc 1][Train](280/100000) average loss: 0.6358787417411804\n",
      "[proc 1][Train](280/100000) average regularization: 6.397631750587607e-06\n",
      "[proc 1][Train] 1 steps take 2.658 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.156\n",
      "[proc 0][Train](281/100000) average pos_loss: 0.7056472897529602\n",
      "[proc 0][Train](281/100000) average neg_loss: 0.31520798802375793\n",
      "[proc 0][Train](281/100000) average loss: 0.5104276537895203\n",
      "[proc 0][Train](281/100000) average regularization: 6.495451998489443e-06\n",
      "[proc 0][Train] 1 steps take 2.710 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.433, backward: 0.070, update: 2.205\n",
      "[proc 1][Train](281/100000) average pos_loss: 0.7264406681060791\n",
      "[proc 1][Train](281/100000) average neg_loss: 0.2957920432090759\n",
      "[proc 1][Train](281/100000) average loss: 0.5111163854598999\n",
      "[proc 1][Train](281/100000) average regularization: 6.4932442001008894e-06\n",
      "[proc 1][Train] 1 steps take 2.652 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.069, update: 2.152\n",
      "[proc 0][Train](282/100000) average pos_loss: 0.6799689531326294\n",
      "[proc 0][Train](282/100000) average neg_loss: 0.602326512336731\n",
      "[proc 0][Train](282/100000) average loss: 0.6411477327346802\n",
      "[proc 0][Train](282/100000) average regularization: 6.3694874370412435e-06\n",
      "[proc 0][Train] 1 steps take 2.743 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.460, backward: 0.070, update: 2.211\n",
      "[proc 1][Train](282/100000) average pos_loss: 0.7267781496047974\n",
      "[proc 1][Train](282/100000) average neg_loss: 0.5312484502792358\n",
      "[proc 1][Train](282/100000) average loss: 0.6290132999420166\n",
      "[proc 1][Train](282/100000) average regularization: 6.620171461690916e-06\n",
      "[proc 1][Train] 1 steps take 2.627 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.125\n",
      "[proc 0][Train](283/100000) average pos_loss: 0.7549871206283569\n",
      "[proc 0][Train](283/100000) average neg_loss: 0.2862081229686737\n",
      "[proc 0][Train](283/100000) average loss: 0.5205976366996765\n",
      "[proc 0][Train](283/100000) average regularization: 6.7383307396085e-06\n",
      "[proc 0][Train] 1 steps take 2.606 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.424, backward: 0.069, update: 2.110\n",
      "[proc 1][Train](283/100000) average pos_loss: 0.7344896793365479\n",
      "[proc 1][Train](283/100000) average neg_loss: 0.28920692205429077\n",
      "[proc 1][Train](283/100000) average loss: 0.5118483304977417\n",
      "[proc 1][Train](283/100000) average regularization: 6.56839756629779e-06\n",
      "[proc 1][Train] 1 steps take 2.646 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.069, update: 2.146\n",
      "[proc 0][Train](284/100000) average pos_loss: 0.6846305131912231\n",
      "[proc 0][Train](284/100000) average neg_loss: 0.5614680051803589\n",
      "[proc 0][Train](284/100000) average loss: 0.623049259185791\n",
      "[proc 0][Train](284/100000) average regularization: 6.4219134401355404e-06\n",
      "[proc 0][Train] 1 steps take 2.624 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.070, update: 2.118\n",
      "[proc 1][Train](284/100000) average pos_loss: 0.7007147073745728\n",
      "[proc 1][Train](284/100000) average neg_loss: 0.5983583927154541\n",
      "[proc 1][Train](284/100000) average loss: 0.6495365500450134\n",
      "[proc 1][Train](284/100000) average regularization: 6.593475973204477e-06\n",
      "[proc 1][Train] 1 steps take 2.640 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.071, update: 2.141\n",
      "[proc 0][Train](285/100000) average pos_loss: 0.7191237211227417\n",
      "[proc 0][Train](285/100000) average neg_loss: 0.2954583466053009\n",
      "[proc 0][Train](285/100000) average loss: 0.5072910189628601\n",
      "[proc 0][Train](285/100000) average regularization: 6.490441137430025e-06\n",
      "[proc 0][Train] 1 steps take 2.648 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.071, update: 2.139\n",
      "[proc 1][Train](285/100000) average pos_loss: 0.7178454399108887\n",
      "[proc 1][Train](285/100000) average neg_loss: 0.35261183977127075\n",
      "[proc 1][Train](285/100000) average loss: 0.5352286100387573\n",
      "[proc 1][Train](285/100000) average regularization: 6.350174771796446e-06\n",
      "[proc 1][Train] 1 steps take 2.642 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.069, update: 2.142\n",
      "[proc 0][Train](286/100000) average pos_loss: 0.6666997075080872\n",
      "[proc 0][Train](286/100000) average neg_loss: 0.5470577478408813\n",
      "[proc 0][Train](286/100000) average loss: 0.6068787574768066\n",
      "[proc 0][Train](286/100000) average regularization: 6.823355761298444e-06\n",
      "[proc 0][Train] 1 steps take 2.720 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.071, update: 2.220\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](286/100000) average pos_loss: 0.719488263130188\n",
      "[proc 1][Train](286/100000) average neg_loss: 0.5863946676254272\n",
      "[proc 1][Train](286/100000) average loss: 0.6529414653778076\n",
      "[proc 1][Train](286/100000) average regularization: 6.52147446089657e-06\n",
      "[proc 1][Train] 1 steps take 2.786 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.284\n",
      "[proc 0][Train](287/100000) average pos_loss: 0.7056901454925537\n",
      "[proc 0][Train](287/100000) average neg_loss: 0.28247150778770447\n",
      "[proc 0][Train](287/100000) average loss: 0.4940808415412903\n",
      "[proc 0][Train](287/100000) average regularization: 7.027064384601545e-06\n",
      "[proc 0][Train] 1 steps take 2.797 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.383, backward: 0.069, update: 2.343\n",
      "[proc 1][Train](287/100000) average pos_loss: 0.6964645385742188\n",
      "[proc 1][Train](287/100000) average neg_loss: 0.29380637407302856\n",
      "[proc 1][Train](287/100000) average loss: 0.49513545632362366\n",
      "[proc 1][Train](287/100000) average regularization: 6.736081559211016e-06\n",
      "[proc 1][Train] 1 steps take 2.681 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.416, backward: 0.070, update: 2.194\n",
      "[proc 0][Train](288/100000) average pos_loss: 0.6785346269607544\n",
      "[proc 0][Train](288/100000) average neg_loss: 0.5630278587341309\n",
      "[proc 0][Train](288/100000) average loss: 0.6207812428474426\n",
      "[proc 0][Train](288/100000) average regularization: 6.436578587454278e-06\n",
      "[proc 0][Train] 1 steps take 2.699 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.201\n",
      "[proc 1][Train](288/100000) average pos_loss: 0.7323969602584839\n",
      "[proc 1][Train](288/100000) average neg_loss: 0.5313398838043213\n",
      "[proc 1][Train](288/100000) average loss: 0.6318684220314026\n",
      "[proc 1][Train](288/100000) average regularization: 6.5592553255555686e-06\n",
      "[proc 1][Train] 1 steps take 2.745 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.242\n",
      "[proc 0][Train](289/100000) average pos_loss: 0.6901620626449585\n",
      "[proc 0][Train](289/100000) average neg_loss: 0.30874329805374146\n",
      "[proc 0][Train](289/100000) average loss: 0.49945268034935\n",
      "[proc 0][Train](289/100000) average regularization: 6.707409738737624e-06\n",
      "[proc 0][Train] 1 steps take 2.706 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.425, backward: 0.071, update: 2.192\n",
      "[proc 1][Train](289/100000) average pos_loss: 0.729235827922821\n",
      "[proc 1][Train](289/100000) average neg_loss: 0.3287120461463928\n",
      "[proc 1][Train](289/100000) average loss: 0.5289739370346069\n",
      "[proc 1][Train](289/100000) average regularization: 6.580385615961859e-06\n",
      "[proc 1][Train] 1 steps take 2.654 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.400, backward: 0.070, update: 2.171\n",
      "[proc 0][Train](290/100000) average pos_loss: 0.7068177461624146\n",
      "[proc 0][Train](290/100000) average neg_loss: 0.5947784781455994\n",
      "[proc 0][Train](290/100000) average loss: 0.6507980823516846\n",
      "[proc 0][Train](290/100000) average regularization: 6.401607606676407e-06\n",
      "[proc 0][Train] 1 steps take 2.679 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.421, backward: 0.071, update: 2.171\n",
      "[proc 1][Train](290/100000) average pos_loss: 0.7200033068656921\n",
      "[proc 1][Train](290/100000) average neg_loss: 0.5427834987640381\n",
      "[proc 1][Train](290/100000) average loss: 0.6313934326171875\n",
      "[proc 1][Train](290/100000) average regularization: 6.70579674988403e-06\n",
      "[proc 1][Train] 1 steps take 2.750 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.434, backward: 0.070, update: 2.229\n",
      "[proc 0][Train](291/100000) average pos_loss: 0.6937997937202454\n",
      "[proc 0][Train](291/100000) average neg_loss: 0.3014007806777954\n",
      "[proc 0][Train](291/100000) average loss: 0.4976002871990204\n",
      "[proc 0][Train](291/100000) average regularization: 6.848181783425389e-06\n",
      "[proc 0][Train] 1 steps take 2.678 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.430, backward: 0.071, update: 2.176\n",
      "[proc 1][Train](291/100000) average pos_loss: 0.7042751312255859\n",
      "[proc 1][Train](291/100000) average neg_loss: 0.31653261184692383\n",
      "[proc 1][Train](291/100000) average loss: 0.5104038715362549\n",
      "[proc 1][Train](291/100000) average regularization: 6.443618985940702e-06\n",
      "[proc 1][Train] 1 steps take 2.676 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.433, backward: 0.069, update: 2.171\n",
      "[proc 0][Train](292/100000) average pos_loss: 0.6711512804031372\n",
      "[proc 0][Train](292/100000) average neg_loss: 0.5386055707931519\n",
      "[proc 0][Train](292/100000) average loss: 0.6048784255981445\n",
      "[proc 0][Train](292/100000) average regularization: 6.802814368711552e-06\n",
      "[proc 0][Train] 1 steps take 2.699 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.526, backward: 0.070, update: 2.102\n",
      "[proc 1][Train](292/100000) average pos_loss: 0.7146737575531006\n",
      "[proc 1][Train](292/100000) average neg_loss: 0.5394741296768188\n",
      "[proc 1][Train](292/100000) average loss: 0.6270739436149597\n",
      "[proc 1][Train](292/100000) average regularization: 6.575871339009609e-06\n",
      "[proc 1][Train] 1 steps take 2.661 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.166\n",
      "[proc 0][Train](293/100000) average pos_loss: 0.6646403074264526\n",
      "[proc 0][Train](293/100000) average neg_loss: 0.2903352379798889\n",
      "[proc 0][Train](293/100000) average loss: 0.4774877727031708\n",
      "[proc 0][Train](293/100000) average regularization: 6.758298695785925e-06\n",
      "[proc 0][Train] 1 steps take 2.614 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.418, backward: 0.071, update: 2.124\n",
      "[proc 1][Train](293/100000) average pos_loss: 0.6975318789482117\n",
      "[proc 1][Train](293/100000) average neg_loss: 0.2840364873409271\n",
      "[proc 1][Train](293/100000) average loss: 0.4907841682434082\n",
      "[proc 1][Train](293/100000) average regularization: 6.4752161961223464e-06\n",
      "[proc 1][Train] 1 steps take 2.708 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.203\n",
      "[proc 0][Train](294/100000) average pos_loss: 0.6414639949798584\n",
      "[proc 0][Train](294/100000) average neg_loss: 0.5859238505363464\n",
      "[proc 0][Train](294/100000) average loss: 0.6136939525604248\n",
      "[proc 0][Train](294/100000) average regularization: 6.682565981463995e-06\n",
      "[proc 0][Train] 1 steps take 2.608 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.429, backward: 0.070, update: 2.108\n",
      "[proc 1][Train](294/100000) average pos_loss: 0.6723923683166504\n",
      "[proc 1][Train](294/100000) average neg_loss: 0.6010149717330933\n",
      "[proc 1][Train](294/100000) average loss: 0.6367036700248718\n",
      "[proc 1][Train](294/100000) average regularization: 6.451703939092113e-06\n",
      "[proc 1][Train] 1 steps take 2.671 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.413, backward: 0.070, update: 2.187\n",
      "[proc 0][Train](295/100000) average pos_loss: 0.7472453713417053\n",
      "[proc 0][Train](295/100000) average neg_loss: 0.3177579641342163\n",
      "[proc 0][Train](295/100000) average loss: 0.5325016975402832\n",
      "[proc 0][Train](295/100000) average regularization: 6.921306066942634e-06\n",
      "[proc 0][Train] 1 steps take 2.710 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.205\n",
      "[proc 1][Train](295/100000) average pos_loss: 0.6855685114860535\n",
      "[proc 1][Train](295/100000) average neg_loss: 0.30125582218170166\n",
      "[proc 1][Train](295/100000) average loss: 0.49341216683387756\n",
      "[proc 1][Train](295/100000) average regularization: 6.646180736424867e-06\n",
      "[proc 1][Train] 1 steps take 2.632 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.134\n",
      "[proc 0][Train](296/100000) average pos_loss: 0.7109690308570862\n",
      "[proc 0][Train](296/100000) average neg_loss: 0.5649833679199219\n",
      "[proc 0][Train](296/100000) average loss: 0.6379761695861816\n",
      "[proc 0][Train](296/100000) average regularization: 6.823130661359755e-06\n",
      "[proc 0][Train] 1 steps take 2.578 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.077\n",
      "[proc 1][Train](296/100000) average pos_loss: 0.7250563502311707\n",
      "[proc 1][Train](296/100000) average neg_loss: 0.5688265562057495\n",
      "[proc 1][Train](296/100000) average loss: 0.6469414234161377\n",
      "[proc 1][Train](296/100000) average regularization: 6.774321718694409e-06\n",
      "[proc 1][Train] 1 steps take 2.761 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.254\n",
      "[proc 0][Train](297/100000) average pos_loss: 0.6861865520477295\n",
      "[proc 0][Train](297/100000) average neg_loss: 0.2992621064186096\n",
      "[proc 0][Train](297/100000) average loss: 0.49272432923316956\n",
      "[proc 0][Train](297/100000) average regularization: 6.599217158509418e-06\n",
      "[proc 0][Train] 1 steps take 2.698 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.455, backward: 0.070, update: 2.170\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](297/100000) average pos_loss: 0.7052925229072571\n",
      "[proc 1][Train](297/100000) average neg_loss: 0.3155503273010254\n",
      "[proc 1][Train](297/100000) average loss: 0.5104213953018188\n",
      "[proc 1][Train](297/100000) average regularization: 6.583666163351154e-06\n",
      "[proc 1][Train] 1 steps take 2.601 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.439, backward: 0.071, update: 2.090\n",
      "[proc 0][Train](298/100000) average pos_loss: 0.7079145908355713\n",
      "[proc 0][Train](298/100000) average neg_loss: 0.5688315033912659\n",
      "[proc 0][Train](298/100000) average loss: 0.6383730173110962\n",
      "[proc 0][Train](298/100000) average regularization: 6.648529051744845e-06\n",
      "[proc 0][Train] 1 steps take 2.680 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.428, backward: 0.070, update: 2.180\n",
      "[proc 1][Train](298/100000) average pos_loss: 0.6960655450820923\n",
      "[proc 1][Train](298/100000) average neg_loss: 0.5648239850997925\n",
      "[proc 1][Train](298/100000) average loss: 0.6304447650909424\n",
      "[proc 1][Train](298/100000) average regularization: 6.7277032940182835e-06\n",
      "[proc 1][Train] 1 steps take 2.678 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.180\n",
      "[proc 0][Train](299/100000) average pos_loss: 0.6899566054344177\n",
      "[proc 0][Train](299/100000) average neg_loss: 0.2924365699291229\n",
      "[proc 0][Train](299/100000) average loss: 0.49119657278060913\n",
      "[proc 0][Train](299/100000) average regularization: 6.857785137981409e-06\n",
      "[proc 0][Train] 1 steps take 2.590 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.070, update: 2.080\n",
      "[proc 1][Train](299/100000) average pos_loss: 0.6821494698524475\n",
      "[proc 1][Train](299/100000) average neg_loss: 0.2739018201828003\n",
      "[proc 1][Train](299/100000) average loss: 0.4780256450176239\n",
      "[proc 1][Train](299/100000) average regularization: 6.6219186010130215e-06\n",
      "[proc 1][Train] 1 steps take 2.700 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.194\n",
      "[proc 0][Train](300/100000) average pos_loss: 0.6753231287002563\n",
      "[proc 0][Train](300/100000) average neg_loss: 0.5979434251785278\n",
      "[proc 0][Train](300/100000) average loss: 0.6366332769393921\n",
      "[proc 0][Train](300/100000) average regularization: 6.4910182118183e-06\n",
      "[proc 0][Train] 1 steps take 2.718 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.455, backward: 0.069, update: 2.191\n",
      "[proc 1][Train](300/100000) average pos_loss: 0.669323205947876\n",
      "[proc 1][Train](300/100000) average neg_loss: 0.5941454172134399\n",
      "[proc 1][Train](300/100000) average loss: 0.631734311580658\n",
      "[proc 1][Train](300/100000) average regularization: 6.778567694709636e-06\n",
      "[proc 1][Train] 1 steps take 2.742 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.422, backward: 0.070, update: 2.248\n",
      "[proc 0][Train](301/100000) average pos_loss: 0.6938006281852722\n",
      "[proc 0][Train](301/100000) average neg_loss: 0.3211957812309265\n",
      "[proc 0][Train](301/100000) average loss: 0.5074982047080994\n",
      "[proc 0][Train](301/100000) average regularization: 6.560302153957309e-06\n",
      "[proc 0][Train] 1 steps take 2.679 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.071, update: 2.171\n",
      "[proc 1][Train](301/100000) average pos_loss: 0.6920205354690552\n",
      "[proc 1][Train](301/100000) average neg_loss: 0.29978999495506287\n",
      "[proc 1][Train](301/100000) average loss: 0.4959052801132202\n",
      "[proc 1][Train](301/100000) average regularization: 6.6033821894961875e-06\n",
      "[proc 1][Train] 1 steps take 2.813 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.470, backward: 0.070, update: 2.272\n",
      "[proc 0][Train](302/100000) average pos_loss: 0.6711641550064087\n",
      "[proc 0][Train](302/100000) average neg_loss: 0.568448543548584\n",
      "[proc 0][Train](302/100000) average loss: 0.6198063492774963\n",
      "[proc 0][Train](302/100000) average regularization: 6.74419925417169e-06\n",
      "[proc 0][Train] 1 steps take 2.856 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.451, backward: 0.070, update: 2.333\n",
      "[proc 1][Train](302/100000) average pos_loss: 0.688828706741333\n",
      "[proc 1][Train](302/100000) average neg_loss: 0.5445176959037781\n",
      "[proc 1][Train](302/100000) average loss: 0.6166732311248779\n",
      "[proc 1][Train](302/100000) average regularization: 6.649000624747714e-06\n",
      "[proc 1][Train] 1 steps take 2.696 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.428, backward: 0.070, update: 2.196\n",
      "[proc 0][Train](303/100000) average pos_loss: 0.7044969797134399\n",
      "[proc 0][Train](303/100000) average neg_loss: 0.28841596841812134\n",
      "[proc 0][Train](303/100000) average loss: 0.49645647406578064\n",
      "[proc 0][Train](303/100000) average regularization: 6.559938810823951e-06\n",
      "[proc 0][Train] 1 steps take 2.663 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.409, backward: 0.070, update: 2.183\n",
      "[proc 1][Train](303/100000) average pos_loss: 0.6709805130958557\n",
      "[proc 1][Train](303/100000) average neg_loss: 0.32211655378341675\n",
      "[proc 1][Train](303/100000) average loss: 0.49654853343963623\n",
      "[proc 1][Train](303/100000) average regularization: 6.472726454376243e-06\n",
      "[proc 1][Train] 1 steps take 2.743 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.419, backward: 0.069, update: 2.253\n",
      "[proc 0][Train](304/100000) average pos_loss: 0.6822975873947144\n",
      "[proc 0][Train](304/100000) average neg_loss: 0.5292387008666992\n",
      "[proc 0][Train](304/100000) average loss: 0.6057681441307068\n",
      "[proc 0][Train](304/100000) average regularization: 6.72030409987201e-06\n",
      "[proc 0][Train] 1 steps take 2.734 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.228\n",
      "[proc 1][Train](304/100000) average pos_loss: 0.6814917325973511\n",
      "[proc 1][Train](304/100000) average neg_loss: 0.5806999206542969\n",
      "[proc 1][Train](304/100000) average loss: 0.631095826625824\n",
      "[proc 1][Train](304/100000) average regularization: 6.87783813191345e-06\n",
      "[proc 1][Train] 1 steps take 2.912 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.407\n",
      "[proc 0][Train](305/100000) average pos_loss: 0.7218354344367981\n",
      "[proc 0][Train](305/100000) average neg_loss: 0.29812929034233093\n",
      "[proc 0][Train](305/100000) average loss: 0.5099823474884033\n",
      "[proc 0][Train](305/100000) average regularization: 6.900341759319417e-06\n",
      "[proc 0][Train] 1 steps take 2.793 seconds\n",
      "[proc 0]sample: 0.014, forward: 0.446, backward: 0.070, update: 2.263\n",
      "[proc 1][Train](305/100000) average pos_loss: 0.6683229207992554\n",
      "[proc 1][Train](305/100000) average neg_loss: 0.30321580171585083\n",
      "[proc 1][Train](305/100000) average loss: 0.4857693612575531\n",
      "[proc 1][Train](305/100000) average regularization: 7.0341302489396185e-06\n",
      "[proc 1][Train] 1 steps take 2.860 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.465, backward: 0.072, update: 2.305\n",
      "[proc 0][Train](306/100000) average pos_loss: 0.6808992624282837\n",
      "[proc 0][Train](306/100000) average neg_loss: 0.5804420709609985\n",
      "[proc 0][Train](306/100000) average loss: 0.6306706666946411\n",
      "[proc 0][Train](306/100000) average regularization: 6.891711564094294e-06\n",
      "[proc 0][Train] 1 steps take 2.875 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.435, backward: 0.070, update: 2.351\n",
      "[proc 1][Train](306/100000) average pos_loss: 0.696628212928772\n",
      "[proc 1][Train](306/100000) average neg_loss: 0.5612716674804688\n",
      "[proc 1][Train](306/100000) average loss: 0.6289499402046204\n",
      "[proc 1][Train](306/100000) average regularization: 6.511048013635445e-06\n",
      "[proc 1][Train] 1 steps take 2.927 seconds\n",
      "[proc 1]sample: 0.020, forward: 0.441, backward: 0.069, update: 2.397\n",
      "[proc 0][Train](307/100000) average pos_loss: 0.7197386622428894\n",
      "[proc 0][Train](307/100000) average neg_loss: 0.3127874732017517\n",
      "[proc 0][Train](307/100000) average loss: 0.5162630677223206\n",
      "[proc 0][Train](307/100000) average regularization: 6.702906830469146e-06\n",
      "[proc 0][Train] 1 steps take 2.640 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.442, backward: 0.072, update: 2.124\n",
      "[proc 1][Train](307/100000) average pos_loss: 0.7099694013595581\n",
      "[proc 1][Train](307/100000) average neg_loss: 0.3218350410461426\n",
      "[proc 1][Train](307/100000) average loss: 0.5159022212028503\n",
      "[proc 1][Train](307/100000) average regularization: 6.6764710027200636e-06\n",
      "[proc 1][Train] 1 steps take 2.775 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.069, update: 2.280\n",
      "[proc 0][Train](308/100000) average pos_loss: 0.6891416311264038\n",
      "[proc 0][Train](308/100000) average neg_loss: 0.5485810041427612\n",
      "[proc 0][Train](308/100000) average loss: 0.6188613176345825\n",
      "[proc 0][Train](308/100000) average regularization: 6.572375696123345e-06\n",
      "[proc 0][Train] 1 steps take 2.811 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.428, backward: 0.071, update: 2.310\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](308/100000) average pos_loss: 0.6853450536727905\n",
      "[proc 1][Train](308/100000) average neg_loss: 0.5402975082397461\n",
      "[proc 1][Train](308/100000) average loss: 0.6128212809562683\n",
      "[proc 1][Train](308/100000) average regularization: 6.7955093072669115e-06\n",
      "[proc 1][Train] 1 steps take 2.739 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.437, backward: 0.070, update: 2.230\n",
      "[proc 0][Train](309/100000) average pos_loss: 0.6931213736534119\n",
      "[proc 0][Train](309/100000) average neg_loss: 0.2882606089115143\n",
      "[proc 0][Train](309/100000) average loss: 0.49069100618362427\n",
      "[proc 0][Train](309/100000) average regularization: 6.768087132513756e-06\n",
      "[proc 0][Train] 1 steps take 2.816 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.468, backward: 0.070, update: 2.277\n",
      "[proc 1][Train](309/100000) average pos_loss: 0.655471920967102\n",
      "[proc 1][Train](309/100000) average neg_loss: 0.30078062415122986\n",
      "[proc 1][Train](309/100000) average loss: 0.47812628746032715\n",
      "[proc 1][Train](309/100000) average regularization: 7.015279607003322e-06\n",
      "[proc 1][Train] 1 steps take 2.876 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.467, backward: 0.069, update: 2.337\n",
      "[proc 0][Train](310/100000) average pos_loss: 0.6637473106384277\n",
      "[proc 0][Train](310/100000) average neg_loss: 0.568185567855835\n",
      "[proc 0][Train](310/100000) average loss: 0.6159664392471313\n",
      "[proc 0][Train](310/100000) average regularization: 6.849558303656522e-06\n",
      "[proc 0][Train] 1 steps take 2.721 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.444, backward: 0.070, update: 2.205\n",
      "[proc 1][Train](310/100000) average pos_loss: 0.6739903092384338\n",
      "[proc 1][Train](310/100000) average neg_loss: 0.5510813593864441\n",
      "[proc 1][Train](310/100000) average loss: 0.612535834312439\n",
      "[proc 1][Train](310/100000) average regularization: 6.644451786996797e-06\n",
      "[proc 1][Train] 1 steps take 2.783 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.282\n",
      "[proc 0][Train](311/100000) average pos_loss: 0.6920129656791687\n",
      "[proc 0][Train](311/100000) average neg_loss: 0.3096868693828583\n",
      "[proc 0][Train](311/100000) average loss: 0.5008499026298523\n",
      "[proc 0][Train](311/100000) average regularization: 6.673648840660462e-06\n",
      "[proc 0][Train] 1 steps take 2.810 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.430, backward: 0.070, update: 2.309\n",
      "[proc 1][Train](311/100000) average pos_loss: 0.6531482338905334\n",
      "[proc 1][Train](311/100000) average neg_loss: 0.3014795780181885\n",
      "[proc 1][Train](311/100000) average loss: 0.47731390595436096\n",
      "[proc 1][Train](311/100000) average regularization: 6.842325092293322e-06\n",
      "[proc 1][Train] 1 steps take 2.823 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.463, backward: 0.069, update: 2.289\n",
      "[proc 0][Train](312/100000) average pos_loss: 0.6728489398956299\n",
      "[proc 0][Train](312/100000) average neg_loss: 0.5921896696090698\n",
      "[proc 0][Train](312/100000) average loss: 0.6325193047523499\n",
      "[proc 0][Train](312/100000) average regularization: 6.467634648288367e-06\n",
      "[proc 0][Train] 1 steps take 2.732 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.439, backward: 0.069, update: 2.222\n",
      "[proc 1][Train](312/100000) average pos_loss: 0.6764215230941772\n",
      "[proc 1][Train](312/100000) average neg_loss: 0.6242344975471497\n",
      "[proc 1][Train](312/100000) average loss: 0.6503280401229858\n",
      "[proc 1][Train](312/100000) average regularization: 6.786912763345754e-06\n",
      "[proc 1][Train] 1 steps take 2.750 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.442, backward: 0.070, update: 2.236\n",
      "[proc 0][Train](313/100000) average pos_loss: 0.6750561594963074\n",
      "[proc 0][Train](313/100000) average neg_loss: 0.3100782334804535\n",
      "[proc 0][Train](313/100000) average loss: 0.49256718158721924\n",
      "[proc 0][Train](313/100000) average regularization: 6.899172149132937e-06\n",
      "[proc 0][Train] 1 steps take 2.681 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.174\n",
      "[proc 1][Train](313/100000) average pos_loss: 0.6461103558540344\n",
      "[proc 1][Train](313/100000) average neg_loss: 0.3080752193927765\n",
      "[proc 1][Train](313/100000) average loss: 0.47709280252456665\n",
      "[proc 1][Train](313/100000) average regularization: 6.851848866062937e-06\n",
      "[proc 1][Train] 1 steps take 2.834 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.474, backward: 0.070, update: 2.289\n",
      "[proc 0][Train](314/100000) average pos_loss: 0.684977650642395\n",
      "[proc 0][Train](314/100000) average neg_loss: 0.5479159951210022\n",
      "[proc 0][Train](314/100000) average loss: 0.616446852684021\n",
      "[proc 0][Train](314/100000) average regularization: 6.688198027404724e-06\n",
      "[proc 0][Train] 1 steps take 2.715 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.430, backward: 0.070, update: 2.213\n",
      "[proc 1][Train](314/100000) average pos_loss: 0.6724000573158264\n",
      "[proc 1][Train](314/100000) average neg_loss: 0.5366710424423218\n",
      "[proc 1][Train](314/100000) average loss: 0.6045355796813965\n",
      "[proc 1][Train](314/100000) average regularization: 7.1309264058072586e-06\n",
      "[proc 1][Train] 1 steps take 2.767 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.447, backward: 0.069, update: 2.249\n",
      "[proc 0][Train](315/100000) average pos_loss: 0.6903836131095886\n",
      "[proc 0][Train](315/100000) average neg_loss: 0.29575517773628235\n",
      "[proc 0][Train](315/100000) average loss: 0.4930694103240967\n",
      "[proc 0][Train](315/100000) average regularization: 6.92657249601325e-06\n",
      "[proc 0][Train] 1 steps take 2.645 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.420, backward: 0.070, update: 2.153\n",
      "[proc 1][Train](315/100000) average pos_loss: 0.6645081043243408\n",
      "[proc 1][Train](315/100000) average neg_loss: 0.29654377698898315\n",
      "[proc 1][Train](315/100000) average loss: 0.480525940656662\n",
      "[proc 1][Train](315/100000) average regularization: 6.7833734647138044e-06\n",
      "[proc 1][Train] 1 steps take 2.746 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.444, backward: 0.070, update: 2.230\n",
      "[proc 0][Train](316/100000) average pos_loss: 0.6874163150787354\n",
      "[proc 0][Train](316/100000) average neg_loss: 0.6042215824127197\n",
      "[proc 0][Train](316/100000) average loss: 0.6458189487457275\n",
      "[proc 0][Train](316/100000) average regularization: 6.937058515177341e-06\n",
      "[proc 0][Train] 1 steps take 2.717 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.213\n",
      "[proc 1][Train](316/100000) average pos_loss: 0.6917383074760437\n",
      "[proc 1][Train](316/100000) average neg_loss: 0.5788778066635132\n",
      "[proc 1][Train](316/100000) average loss: 0.635308027267456\n",
      "[proc 1][Train](316/100000) average regularization: 6.691422186122509e-06\n",
      "[proc 1][Train] 1 steps take 2.733 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.440, backward: 0.070, update: 2.221\n",
      "[proc 0][Train](317/100000) average pos_loss: 0.6486314535140991\n",
      "[proc 0][Train](317/100000) average neg_loss: 0.29839104413986206\n",
      "[proc 0][Train](317/100000) average loss: 0.4735112488269806\n",
      "[proc 0][Train](317/100000) average regularization: 7.011965863057412e-06\n",
      "[proc 0][Train] 1 steps take 2.717 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.212\n",
      "[proc 1][Train](317/100000) average pos_loss: 0.6516019701957703\n",
      "[proc 1][Train](317/100000) average neg_loss: 0.31651705503463745\n",
      "[proc 1][Train](317/100000) average loss: 0.48405951261520386\n",
      "[proc 1][Train](317/100000) average regularization: 6.864794158900622e-06\n",
      "[proc 1][Train] 1 steps take 2.728 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.439, backward: 0.070, update: 2.217\n",
      "[proc 0][Train](318/100000) average pos_loss: 0.6602959632873535\n",
      "[proc 0][Train](318/100000) average neg_loss: 0.5779895782470703\n",
      "[proc 0][Train](318/100000) average loss: 0.6191427707672119\n",
      "[proc 0][Train](318/100000) average regularization: 6.9302122938097455e-06\n",
      "[proc 0][Train] 1 steps take 2.666 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.167\n",
      "[proc 1][Train](318/100000) average pos_loss: 0.6374998092651367\n",
      "[proc 1][Train](318/100000) average neg_loss: 0.5845672488212585\n",
      "[proc 1][Train](318/100000) average loss: 0.61103355884552\n",
      "[proc 1][Train](318/100000) average regularization: 6.6984766817768104e-06\n",
      "[proc 1][Train] 1 steps take 2.986 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.488\n",
      "[proc 0][Train](319/100000) average pos_loss: 0.7041311264038086\n",
      "[proc 0][Train](319/100000) average neg_loss: 0.29462361335754395\n",
      "[proc 0][Train](319/100000) average loss: 0.49937736988067627\n",
      "[proc 0][Train](319/100000) average regularization: 6.734535418218002e-06\n",
      "[proc 0][Train] 1 steps take 2.541 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.423, backward: 0.070, update: 2.046\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](319/100000) average pos_loss: 0.6891223192214966\n",
      "[proc 1][Train](319/100000) average neg_loss: 0.2946910262107849\n",
      "[proc 1][Train](319/100000) average loss: 0.49190667271614075\n",
      "[proc 1][Train](319/100000) average regularization: 6.973408744670451e-06\n",
      "[proc 1][Train] 1 steps take 2.674 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.431, backward: 0.070, update: 2.171\n",
      "[proc 0][Train](320/100000) average pos_loss: 0.6582446694374084\n",
      "[proc 0][Train](320/100000) average neg_loss: 0.5468236207962036\n",
      "[proc 0][Train](320/100000) average loss: 0.6025341749191284\n",
      "[proc 0][Train](320/100000) average regularization: 6.891878456372069e-06\n",
      "[proc 0][Train] 1 steps take 2.706 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.204\n",
      "[proc 1][Train](320/100000) average pos_loss: 0.6810446381568909\n",
      "[proc 1][Train](320/100000) average neg_loss: 0.5204712152481079\n",
      "[proc 1][Train](320/100000) average loss: 0.6007579565048218\n",
      "[proc 1][Train](320/100000) average regularization: 6.812045285187196e-06\n",
      "[proc 1][Train] 1 steps take 2.556 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.425, backward: 0.071, update: 2.059\n",
      "[proc 0][Train](321/100000) average pos_loss: 0.6469972133636475\n",
      "[proc 0][Train](321/100000) average neg_loss: 0.2846387028694153\n",
      "[proc 0][Train](321/100000) average loss: 0.46581795811653137\n",
      "[proc 0][Train](321/100000) average regularization: 6.805879365856526e-06\n",
      "[proc 0][Train] 1 steps take 2.803 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.438, backward: 0.069, update: 2.279\n",
      "[proc 1][Train](321/100000) average pos_loss: 0.6517190933227539\n",
      "[proc 1][Train](321/100000) average neg_loss: 0.3164849877357483\n",
      "[proc 1][Train](321/100000) average loss: 0.4841020405292511\n",
      "[proc 1][Train](321/100000) average regularization: 6.901017059135484e-06\n",
      "[proc 1][Train] 1 steps take 2.605 seconds\n",
      "[proc 1]sample: 0.019, forward: 0.409, backward: 0.069, update: 2.108\n",
      "[proc 0][Train](322/100000) average pos_loss: 0.6527529954910278\n",
      "[proc 0][Train](322/100000) average neg_loss: 0.539576530456543\n",
      "[proc 0][Train](322/100000) average loss: 0.5961647629737854\n",
      "[proc 0][Train](322/100000) average regularization: 7.026058938208735e-06\n",
      "[proc 0][Train] 1 steps take 2.634 seconds\n",
      "[proc 0]sample: 0.020, forward: 0.436, backward: 0.070, update: 2.108\n",
      "[proc 1][Train](322/100000) average pos_loss: 0.6980799436569214\n",
      "[proc 1][Train](322/100000) average neg_loss: 0.6073040962219238\n",
      "[proc 1][Train](322/100000) average loss: 0.6526920199394226\n",
      "[proc 1][Train](322/100000) average regularization: 6.9981065280444454e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.436, backward: 0.070, update: 2.159\n",
      "[proc 0][Train](323/100000) average pos_loss: 0.6898900270462036\n",
      "[proc 0][Train](323/100000) average neg_loss: 0.29646801948547363\n",
      "[proc 0][Train](323/100000) average loss: 0.4931790232658386\n",
      "[proc 0][Train](323/100000) average regularization: 6.757433311577188e-06\n",
      "[proc 0][Train] 1 steps take 2.641 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.442, backward: 0.070, update: 2.128\n",
      "[proc 1][Train](323/100000) average pos_loss: 0.6499120593070984\n",
      "[proc 1][Train](323/100000) average neg_loss: 0.29328763484954834\n",
      "[proc 1][Train](323/100000) average loss: 0.47159984707832336\n",
      "[proc 1][Train](323/100000) average regularization: 7.107061264832737e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.173\n",
      "[proc 0][Train](324/100000) average pos_loss: 0.6618006229400635\n",
      "[proc 0][Train](324/100000) average neg_loss: 0.5826206207275391\n",
      "[proc 0][Train](324/100000) average loss: 0.6222106218338013\n",
      "[proc 0][Train](324/100000) average regularization: 6.7336709435039666e-06\n",
      "[proc 0][Train] 1 steps take 2.654 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.148\n",
      "[proc 1][Train](324/100000) average pos_loss: 0.6417238712310791\n",
      "[proc 1][Train](324/100000) average neg_loss: 0.5741188526153564\n",
      "[proc 1][Train](324/100000) average loss: 0.6079213619232178\n",
      "[proc 1][Train](324/100000) average regularization: 6.964144176890841e-06\n",
      "[proc 1][Train] 1 steps take 2.666 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.424, backward: 0.070, update: 2.171\n",
      "[proc 0][Train](325/100000) average pos_loss: 0.6945275068283081\n",
      "[proc 0][Train](325/100000) average neg_loss: 0.30733466148376465\n",
      "[proc 0][Train](325/100000) average loss: 0.5009310841560364\n",
      "[proc 0][Train](325/100000) average regularization: 6.550068064825609e-06\n",
      "[proc 0][Train] 1 steps take 2.745 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.494, backward: 0.069, update: 2.180\n",
      "[proc 1][Train](325/100000) average pos_loss: 0.6943830847740173\n",
      "[proc 1][Train](325/100000) average neg_loss: 0.3410766124725342\n",
      "[proc 1][Train](325/100000) average loss: 0.5177298784255981\n",
      "[proc 1][Train](325/100000) average regularization: 6.6845459514297545e-06\n",
      "[proc 1][Train] 1 steps take 2.640 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.142\n",
      "[proc 0][Train](326/100000) average pos_loss: 0.6715495586395264\n",
      "[proc 0][Train](326/100000) average neg_loss: 0.566729724407196\n",
      "[proc 0][Train](326/100000) average loss: 0.6191396713256836\n",
      "[proc 0][Train](326/100000) average regularization: 7.119064321159385e-06\n",
      "[proc 0][Train] 1 steps take 2.661 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.438, backward: 0.070, update: 2.152\n",
      "[proc 1][Train](326/100000) average pos_loss: 0.6684216856956482\n",
      "[proc 1][Train](326/100000) average neg_loss: 0.5876927971839905\n",
      "[proc 1][Train](326/100000) average loss: 0.6280572414398193\n",
      "[proc 1][Train](326/100000) average regularization: 7.008457941992674e-06\n",
      "[proc 1][Train] 1 steps take 2.672 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.437, backward: 0.070, update: 2.164\n",
      "[proc 0][Train](327/100000) average pos_loss: 0.6987835764884949\n",
      "[proc 0][Train](327/100000) average neg_loss: 0.3055959641933441\n",
      "[proc 0][Train](327/100000) average loss: 0.5021897554397583\n",
      "[proc 0][Train](327/100000) average regularization: 6.908932391525013e-06\n",
      "[proc 0][Train] 1 steps take 2.653 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.442, backward: 0.070, update: 2.140\n",
      "[proc 1][Train](327/100000) average pos_loss: 0.6779769659042358\n",
      "[proc 1][Train](327/100000) average neg_loss: 0.30190688371658325\n",
      "[proc 1][Train](327/100000) average loss: 0.48994192481040955\n",
      "[proc 1][Train](327/100000) average regularization: 6.85220402374398e-06\n",
      "[proc 1][Train] 1 steps take 2.652 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.071, update: 2.154\n",
      "[proc 0][Train](328/100000) average pos_loss: 0.6989929676055908\n",
      "[proc 0][Train](328/100000) average neg_loss: 0.6005139350891113\n",
      "[proc 0][Train](328/100000) average loss: 0.6497534513473511\n",
      "[proc 0][Train](328/100000) average regularization: 6.810042577853892e-06\n",
      "[proc 0][Train] 1 steps take 2.712 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.204\n",
      "[proc 1][Train](328/100000) average pos_loss: 0.6680530309677124\n",
      "[proc 1][Train](328/100000) average neg_loss: 0.5952844619750977\n",
      "[proc 1][Train](328/100000) average loss: 0.631668746471405\n",
      "[proc 1][Train](328/100000) average regularization: 6.918540293554543e-06\n",
      "[proc 1][Train] 1 steps take 2.663 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.398, backward: 0.070, update: 2.194\n",
      "[proc 0][Train](329/100000) average pos_loss: 0.6591879725456238\n",
      "[proc 0][Train](329/100000) average neg_loss: 0.2881985902786255\n",
      "[proc 0][Train](329/100000) average loss: 0.47369328141212463\n",
      "[proc 0][Train](329/100000) average regularization: 7.170425305957906e-06\n",
      "[proc 0][Train] 1 steps take 2.695 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.069, update: 2.192\n",
      "[proc 1][Train](329/100000) average pos_loss: 0.6167776584625244\n",
      "[proc 1][Train](329/100000) average neg_loss: 0.32475775480270386\n",
      "[proc 1][Train](329/100000) average loss: 0.47076770663261414\n",
      "[proc 1][Train](329/100000) average regularization: 6.880066848680144e-06\n",
      "[proc 1][Train] 1 steps take 2.656 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.446, backward: 0.070, update: 2.138\n",
      "[proc 0][Train](330/100000) average pos_loss: 0.656097412109375\n",
      "[proc 0][Train](330/100000) average neg_loss: 0.5598143935203552\n",
      "[proc 0][Train](330/100000) average loss: 0.6079559326171875\n",
      "[proc 0][Train](330/100000) average regularization: 6.84893666402786e-06\n",
      "[proc 0][Train] 1 steps take 2.653 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.405, backward: 0.070, update: 2.178\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](330/100000) average pos_loss: 0.651321530342102\n",
      "[proc 1][Train](330/100000) average neg_loss: 0.5750472545623779\n",
      "[proc 1][Train](330/100000) average loss: 0.61318439245224\n",
      "[proc 1][Train](330/100000) average regularization: 7.134442967071664e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.069, update: 2.183\n",
      "[proc 0][Train](331/100000) average pos_loss: 0.6400346159934998\n",
      "[proc 0][Train](331/100000) average neg_loss: 0.2744755744934082\n",
      "[proc 0][Train](331/100000) average loss: 0.457255095243454\n",
      "[proc 0][Train](331/100000) average regularization: 7.078707767504966e-06\n",
      "[proc 0][Train] 1 steps take 2.686 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.184\n",
      "[proc 1][Train](331/100000) average pos_loss: 0.6649390459060669\n",
      "[proc 1][Train](331/100000) average neg_loss: 0.3178738057613373\n",
      "[proc 1][Train](331/100000) average loss: 0.4914064407348633\n",
      "[proc 1][Train](331/100000) average regularization: 7.0265641625155695e-06\n",
      "[proc 1][Train] 1 steps take 2.650 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.415, backward: 0.070, update: 2.163\n",
      "[proc 0][Train](332/100000) average pos_loss: 0.6490097045898438\n",
      "[proc 0][Train](332/100000) average neg_loss: 0.5048115253448486\n",
      "[proc 0][Train](332/100000) average loss: 0.5769106149673462\n",
      "[proc 0][Train](332/100000) average regularization: 6.785818641219521e-06\n",
      "[proc 0][Train] 1 steps take 2.660 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.425, backward: 0.070, update: 2.164\n",
      "[proc 1][Train](332/100000) average pos_loss: 0.6408572196960449\n",
      "[proc 1][Train](332/100000) average neg_loss: 0.5682437419891357\n",
      "[proc 1][Train](332/100000) average loss: 0.6045504808425903\n",
      "[proc 1][Train](332/100000) average regularization: 7.257759989443002e-06\n",
      "[proc 1][Train] 1 steps take 2.693 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.464, backward: 0.070, update: 2.158\n",
      "[proc 0][Train](333/100000) average pos_loss: 0.6916584968566895\n",
      "[proc 0][Train](333/100000) average neg_loss: 0.3096446394920349\n",
      "[proc 0][Train](333/100000) average loss: 0.5006515979766846\n",
      "[proc 0][Train](333/100000) average regularization: 6.897804269101471e-06\n",
      "[proc 0][Train] 1 steps take 2.608 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.425, backward: 0.069, update: 2.112\n",
      "[proc 1][Train](333/100000) average pos_loss: 0.6683593988418579\n",
      "[proc 1][Train](333/100000) average neg_loss: 0.31903350353240967\n",
      "[proc 1][Train](333/100000) average loss: 0.4936964511871338\n",
      "[proc 1][Train](333/100000) average regularization: 6.86931753079989e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.453, backward: 0.070, update: 2.154\n",
      "[proc 0][Train](334/100000) average pos_loss: 0.6817935705184937\n",
      "[proc 0][Train](334/100000) average neg_loss: 0.5914111733436584\n",
      "[proc 0][Train](334/100000) average loss: 0.6366024017333984\n",
      "[proc 0][Train](334/100000) average regularization: 6.9480156525969505e-06\n",
      "[proc 0][Train] 1 steps take 2.643 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.069, update: 2.145\n",
      "[proc 1][Train](334/100000) average pos_loss: 0.6427872180938721\n",
      "[proc 1][Train](334/100000) average neg_loss: 0.5572970509529114\n",
      "[proc 1][Train](334/100000) average loss: 0.6000421047210693\n",
      "[proc 1][Train](334/100000) average regularization: 6.825495802331716e-06\n",
      "[proc 1][Train] 1 steps take 2.644 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.428, backward: 0.070, update: 2.145\n",
      "[proc 0][Train](335/100000) average pos_loss: 0.6710262894630432\n",
      "[proc 0][Train](335/100000) average neg_loss: 0.2896977961063385\n",
      "[proc 0][Train](335/100000) average loss: 0.48036205768585205\n",
      "[proc 0][Train](335/100000) average regularization: 7.3852561399689876e-06\n",
      "[proc 0][Train] 1 steps take 2.643 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.145\n",
      "[proc 1][Train](335/100000) average pos_loss: 0.7058236598968506\n",
      "[proc 1][Train](335/100000) average neg_loss: 0.3046882152557373\n",
      "[proc 1][Train](335/100000) average loss: 0.505255937576294\n",
      "[proc 1][Train](335/100000) average regularization: 7.0533278631046414e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.402, backward: 0.070, update: 2.206\n",
      "[proc 0][Train](336/100000) average pos_loss: 0.6522629261016846\n",
      "[proc 0][Train](336/100000) average neg_loss: 0.570999026298523\n",
      "[proc 0][Train](336/100000) average loss: 0.6116309762001038\n",
      "[proc 0][Train](336/100000) average regularization: 7.093271506164456e-06\n",
      "[proc 0][Train] 1 steps take 2.673 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.167\n",
      "[proc 1][Train](336/100000) average pos_loss: 0.6813994646072388\n",
      "[proc 1][Train](336/100000) average neg_loss: 0.5532378554344177\n",
      "[proc 1][Train](336/100000) average loss: 0.6173186302185059\n",
      "[proc 1][Train](336/100000) average regularization: 7.137568900361657e-06\n",
      "[proc 1][Train] 1 steps take 2.710 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.207\n",
      "[proc 0][Train](337/100000) average pos_loss: 0.6741862893104553\n",
      "[proc 0][Train](337/100000) average neg_loss: 0.30825695395469666\n",
      "[proc 0][Train](337/100000) average loss: 0.4912216067314148\n",
      "[proc 0][Train](337/100000) average regularization: 6.9754801188537385e-06\n",
      "[proc 0][Train] 1 steps take 2.666 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.437, backward: 0.071, update: 2.143\n",
      "[proc 1][Train](337/100000) average pos_loss: 0.651365339756012\n",
      "[proc 1][Train](337/100000) average neg_loss: 0.2999629080295563\n",
      "[proc 1][Train](337/100000) average loss: 0.4756641387939453\n",
      "[proc 1][Train](337/100000) average regularization: 6.95696189723094e-06\n",
      "[proc 1][Train] 1 steps take 2.632 seconds\n",
      "[proc 1]sample: 0.015, forward: 0.420, backward: 0.070, update: 2.127\n",
      "[proc 0][Train](338/100000) average pos_loss: 0.647944986820221\n",
      "[proc 0][Train](338/100000) average neg_loss: 0.577804446220398\n",
      "[proc 0][Train](338/100000) average loss: 0.6128747463226318\n",
      "[proc 0][Train](338/100000) average regularization: 6.991017926338827e-06\n",
      "[proc 0][Train] 1 steps take 2.623 seconds\n",
      "[proc 0]sample: 0.015, forward: 0.433, backward: 0.069, update: 2.106\n",
      "[proc 1][Train](338/100000) average pos_loss: 0.6921975016593933\n",
      "[proc 1][Train](338/100000) average neg_loss: 0.5625424385070801\n",
      "[proc 1][Train](338/100000) average loss: 0.6273699998855591\n",
      "[proc 1][Train](338/100000) average regularization: 7.216564881673548e-06\n",
      "[proc 1][Train] 1 steps take 2.654 seconds\n",
      "[proc 1]sample: 0.013, forward: 0.407, backward: 0.070, update: 2.164\n",
      "[proc 0][Train](339/100000) average pos_loss: 0.6767305135726929\n",
      "[proc 0][Train](339/100000) average neg_loss: 0.2929747998714447\n",
      "[proc 0][Train](339/100000) average loss: 0.48485267162323\n",
      "[proc 0][Train](339/100000) average regularization: 7.3303954195580445e-06\n",
      "[proc 0][Train] 1 steps take 2.625 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.122\n",
      "[proc 1][Train](339/100000) average pos_loss: 0.6623113751411438\n",
      "[proc 1][Train](339/100000) average neg_loss: 0.30793583393096924\n",
      "[proc 1][Train](339/100000) average loss: 0.4851236045360565\n",
      "[proc 1][Train](339/100000) average regularization: 6.975984433665872e-06\n",
      "[proc 1][Train] 1 steps take 2.653 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.418, backward: 0.070, update: 2.163\n",
      "[proc 0][Train](340/100000) average pos_loss: 0.6275100111961365\n",
      "[proc 0][Train](340/100000) average neg_loss: 0.5696207284927368\n",
      "[proc 0][Train](340/100000) average loss: 0.5985653400421143\n",
      "[proc 0][Train](340/100000) average regularization: 7.228407412185334e-06\n",
      "[proc 0][Train] 1 steps take 2.601 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.097\n",
      "[proc 1][Train](340/100000) average pos_loss: 0.6271259784698486\n",
      "[proc 1][Train](340/100000) average neg_loss: 0.5811694860458374\n",
      "[proc 1][Train](340/100000) average loss: 0.604147732257843\n",
      "[proc 1][Train](340/100000) average regularization: 7.18638148100581e-06\n",
      "[proc 1][Train] 1 steps take 2.611 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.117\n",
      "[proc 0][Train](341/100000) average pos_loss: 0.6886945962905884\n",
      "[proc 0][Train](341/100000) average neg_loss: 0.3124920725822449\n",
      "[proc 0][Train](341/100000) average loss: 0.5005933046340942\n",
      "[proc 0][Train](341/100000) average regularization: 6.99962220096495e-06\n",
      "[proc 0][Train] 1 steps take 2.589 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.433, backward: 0.070, update: 2.084\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](341/100000) average pos_loss: 0.6734422445297241\n",
      "[proc 1][Train](341/100000) average neg_loss: 0.28627800941467285\n",
      "[proc 1][Train](341/100000) average loss: 0.4798601269721985\n",
      "[proc 1][Train](341/100000) average regularization: 7.113193532859441e-06\n",
      "[proc 1][Train] 1 steps take 2.689 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.187\n",
      "[proc 0][Train](342/100000) average pos_loss: 0.6727498769760132\n",
      "[proc 0][Train](342/100000) average neg_loss: 0.5751352310180664\n",
      "[proc 0][Train](342/100000) average loss: 0.6239425539970398\n",
      "[proc 0][Train](342/100000) average regularization: 6.991429017944029e-06\n",
      "[proc 0][Train] 1 steps take 2.655 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.443, backward: 0.070, update: 2.141\n",
      "[proc 1][Train](342/100000) average pos_loss: 0.6501933336257935\n",
      "[proc 1][Train](342/100000) average neg_loss: 0.5584733486175537\n",
      "[proc 1][Train](342/100000) average loss: 0.6043333411216736\n",
      "[proc 1][Train](342/100000) average regularization: 7.086973710102029e-06\n",
      "[proc 1][Train] 1 steps take 2.695 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.069, update: 2.193\n",
      "[proc 0][Train](343/100000) average pos_loss: 0.6677976846694946\n",
      "[proc 0][Train](343/100000) average neg_loss: 0.3077143728733063\n",
      "[proc 0][Train](343/100000) average loss: 0.48775601387023926\n",
      "[proc 0][Train](343/100000) average regularization: 7.1387171374226455e-06\n",
      "[proc 0][Train] 1 steps take 2.725 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.069, update: 2.219\n",
      "[proc 1][Train](343/100000) average pos_loss: 0.6542307734489441\n",
      "[proc 1][Train](343/100000) average neg_loss: 0.28848499059677124\n",
      "[proc 1][Train](343/100000) average loss: 0.47135788202285767\n",
      "[proc 1][Train](343/100000) average regularization: 7.059923063934548e-06\n",
      "[proc 1][Train] 1 steps take 2.636 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.069, update: 2.138\n",
      "[proc 0][Train](344/100000) average pos_loss: 0.6718372106552124\n",
      "[proc 0][Train](344/100000) average neg_loss: 0.5845671892166138\n",
      "[proc 0][Train](344/100000) average loss: 0.6282021999359131\n",
      "[proc 0][Train](344/100000) average regularization: 7.12530982127646e-06\n",
      "[proc 0][Train] 1 steps take 2.674 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.441, backward: 0.070, update: 2.162\n",
      "[proc 1][Train](344/100000) average pos_loss: 0.6305818557739258\n",
      "[proc 1][Train](344/100000) average neg_loss: 0.5647388696670532\n",
      "[proc 1][Train](344/100000) average loss: 0.5976603627204895\n",
      "[proc 1][Train](344/100000) average regularization: 7.148941222112626e-06\n",
      "[proc 1][Train] 1 steps take 2.646 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.421, backward: 0.069, update: 2.154\n",
      "[proc 0][Train](345/100000) average pos_loss: 0.6503119468688965\n",
      "[proc 0][Train](345/100000) average neg_loss: 0.2819392681121826\n",
      "[proc 0][Train](345/100000) average loss: 0.46612560749053955\n",
      "[proc 0][Train](345/100000) average regularization: 7.20418711352977e-06\n",
      "[proc 0][Train] 1 steps take 2.647 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.142\n",
      "[proc 1][Train](345/100000) average pos_loss: 0.6804954409599304\n",
      "[proc 1][Train](345/100000) average neg_loss: 0.26276257634162903\n",
      "[proc 1][Train](345/100000) average loss: 0.4716290235519409\n",
      "[proc 1][Train](345/100000) average regularization: 7.046978225844214e-06\n",
      "[proc 1][Train] 1 steps take 2.614 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.420, backward: 0.071, update: 2.121\n",
      "[proc 0][Train](346/100000) average pos_loss: 0.6154581904411316\n",
      "[proc 0][Train](346/100000) average neg_loss: 0.5974873304367065\n",
      "[proc 0][Train](346/100000) average loss: 0.6064727306365967\n",
      "[proc 0][Train](346/100000) average regularization: 6.9398047344293445e-06\n",
      "[proc 0][Train] 1 steps take 2.629 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.437, backward: 0.069, update: 2.122\n",
      "[proc 1][Train](346/100000) average pos_loss: 0.6662055253982544\n",
      "[proc 1][Train](346/100000) average neg_loss: 0.5732760429382324\n",
      "[proc 1][Train](346/100000) average loss: 0.6197407841682434\n",
      "[proc 1][Train](346/100000) average regularization: 7.105066742951749e-06\n",
      "[proc 1][Train] 1 steps take 2.621 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.426, backward: 0.070, update: 2.124\n",
      "[proc 0][Train](347/100000) average pos_loss: 0.7002251744270325\n",
      "[proc 0][Train](347/100000) average neg_loss: 0.2956632673740387\n",
      "[proc 0][Train](347/100000) average loss: 0.4979442358016968\n",
      "[proc 0][Train](347/100000) average regularization: 6.918837698322022e-06\n",
      "[proc 0][Train] 1 steps take 2.708 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.429, backward: 0.069, update: 2.208\n",
      "[proc 1][Train](347/100000) average pos_loss: 0.6268284320831299\n",
      "[proc 1][Train](347/100000) average neg_loss: 0.283110111951828\n",
      "[proc 1][Train](347/100000) average loss: 0.45496928691864014\n",
      "[proc 1][Train](347/100000) average regularization: 6.9197785705910064e-06\n",
      "[proc 1][Train] 1 steps take 3.256 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.409, backward: 0.069, update: 2.776\n",
      "[proc 0][Train](348/100000) average pos_loss: 0.6178751587867737\n",
      "[proc 0][Train](348/100000) average neg_loss: 0.5419983863830566\n",
      "[proc 0][Train](348/100000) average loss: 0.5799367427825928\n",
      "[proc 0][Train](348/100000) average regularization: 7.095847195159877e-06\n",
      "[proc 0][Train] 1 steps take 2.630 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.409, backward: 0.070, update: 2.150\n",
      "[proc 0][Train](349/100000) average pos_loss: 0.6321281790733337\n",
      "[proc 0][Train](349/100000) average neg_loss: 0.29170656204223633\n",
      "[proc 0][Train](349/100000) average loss: 0.46191737055778503\n",
      "[proc 0][Train](349/100000) average regularization: 6.972403753024992e-06\n",
      "[proc 0][Train] 1 steps take 2.734 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.486, backward: 0.069, update: 2.177\n",
      "[proc 1][Train](348/100000) average pos_loss: 0.6367111802101135\n",
      "[proc 1][Train](348/100000) average neg_loss: 0.5354288816452026\n",
      "[proc 1][Train](348/100000) average loss: 0.5860700607299805\n",
      "[proc 1][Train](348/100000) average regularization: 7.118339453882072e-06\n",
      "[proc 1][Train] 1 steps take 2.932 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.538, backward: 0.073, update: 2.319\n",
      "[proc 0][Train](350/100000) average pos_loss: 0.6592421531677246\n",
      "[proc 0][Train](350/100000) average neg_loss: 0.5754498243331909\n",
      "[proc 0][Train](350/100000) average loss: 0.6173459887504578\n",
      "[proc 0][Train](350/100000) average regularization: 6.918114650034113e-06\n",
      "[proc 0][Train] 1 steps take 2.831 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.487, backward: 0.069, update: 2.274\n",
      "[proc 1][Train](349/100000) average pos_loss: 0.6316303014755249\n",
      "[proc 1][Train](349/100000) average neg_loss: 0.3078424334526062\n",
      "[proc 1][Train](349/100000) average loss: 0.46973636746406555\n",
      "[proc 1][Train](349/100000) average regularization: 6.830113306932617e-06\n",
      "[proc 1][Train] 1 steps take 2.757 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.456, backward: 0.071, update: 2.228\n",
      "[proc 1][Train](350/100000) average pos_loss: 0.6224009990692139\n",
      "[proc 1][Train](350/100000) average neg_loss: 0.5920170545578003\n",
      "[proc 1][Train](350/100000) average loss: 0.6072090268135071\n",
      "[proc 1][Train](350/100000) average regularization: 7.073114829836413e-06\n",
      "[proc 1][Train] 1 steps take 2.854 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.493, backward: 0.070, update: 2.290\n",
      "[proc 0][Train](351/100000) average pos_loss: 0.6349443197250366\n",
      "[proc 0][Train](351/100000) average neg_loss: 0.30672961473464966\n",
      "[proc 0][Train](351/100000) average loss: 0.47083696722984314\n",
      "[proc 0][Train](351/100000) average regularization: 7.022952559054829e-06\n",
      "[proc 0][Train] 1 steps take 3.025 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.525, backward: 0.071, update: 2.427\n",
      "[proc 1][Train](351/100000) average pos_loss: 0.6499710083007812\n",
      "[proc 1][Train](351/100000) average neg_loss: 0.3149818778038025\n",
      "[proc 1][Train](351/100000) average loss: 0.48247644305229187\n",
      "[proc 1][Train](351/100000) average regularization: 6.880100954731461e-06\n",
      "[proc 1][Train] 1 steps take 2.799 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.474, backward: 0.069, update: 2.255\n",
      "[proc 0][Train](352/100000) average pos_loss: 0.6410704851150513\n",
      "[proc 0][Train](352/100000) average neg_loss: 0.565665602684021\n",
      "[proc 0][Train](352/100000) average loss: 0.6033680438995361\n",
      "[proc 0][Train](352/100000) average regularization: 6.867726824566489e-06\n",
      "[proc 0][Train] 1 steps take 2.780 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.457, backward: 0.070, update: 2.251\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](352/100000) average pos_loss: 0.6267571449279785\n",
      "[proc 1][Train](352/100000) average neg_loss: 0.5724413394927979\n",
      "[proc 1][Train](352/100000) average loss: 0.5995992422103882\n",
      "[proc 1][Train](352/100000) average regularization: 7.0443315962620545e-06\n",
      "[proc 1][Train] 1 steps take 2.865 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.470, backward: 0.070, update: 2.323\n",
      "[proc 0][Train](353/100000) average pos_loss: 0.6505265235900879\n",
      "[proc 0][Train](353/100000) average neg_loss: 0.304477721452713\n",
      "[proc 0][Train](353/100000) average loss: 0.47750210762023926\n",
      "[proc 0][Train](353/100000) average regularization: 6.9922784859954845e-06\n",
      "[proc 0][Train] 1 steps take 2.920 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.456, backward: 0.069, update: 2.380\n",
      "[proc 1][Train](353/100000) average pos_loss: 0.6833118200302124\n",
      "[proc 1][Train](353/100000) average neg_loss: 0.2970021963119507\n",
      "[proc 1][Train](353/100000) average loss: 0.49015700817108154\n",
      "[proc 1][Train](353/100000) average regularization: 7.139084573282162e-06\n",
      "[proc 1][Train] 1 steps take 2.851 seconds\n",
      "[proc 1]sample: 0.020, forward: 0.476, backward: 0.069, update: 2.287\n",
      "[proc 0][Train](354/100000) average pos_loss: 0.6749030947685242\n",
      "[proc 0][Train](354/100000) average neg_loss: 0.5751953721046448\n",
      "[proc 0][Train](354/100000) average loss: 0.6250492334365845\n",
      "[proc 0][Train](354/100000) average regularization: 7.033170277281897e-06\n",
      "[proc 0][Train] 1 steps take 2.813 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.451, backward: 0.069, update: 2.276\n",
      "[proc 1][Train](354/100000) average pos_loss: 0.6421105861663818\n",
      "[proc 1][Train](354/100000) average neg_loss: 0.5297964811325073\n",
      "[proc 1][Train](354/100000) average loss: 0.5859535336494446\n",
      "[proc 1][Train](354/100000) average regularization: 7.136425210774178e-06\n",
      "[proc 1][Train] 1 steps take 2.745 seconds\n",
      "[proc 1]sample: 0.016, forward: 0.472, backward: 0.069, update: 2.188\n",
      "[proc 0][Train](355/100000) average pos_loss: 0.6427217721939087\n",
      "[proc 0][Train](355/100000) average neg_loss: 0.2908337116241455\n",
      "[proc 0][Train](355/100000) average loss: 0.4667777419090271\n",
      "[proc 0][Train](355/100000) average regularization: 6.978006695135264e-06\n",
      "[proc 0][Train] 1 steps take 2.923 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.462, backward: 0.069, update: 2.390\n",
      "[proc 1][Train](355/100000) average pos_loss: 0.6188924908638\n",
      "[proc 1][Train](355/100000) average neg_loss: 0.3390222191810608\n",
      "[proc 1][Train](355/100000) average loss: 0.4789573550224304\n",
      "[proc 1][Train](355/100000) average regularization: 7.155712410167325e-06\n",
      "[proc 1][Train] 1 steps take 2.698 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.417, backward: 0.070, update: 2.210\n",
      "[proc 0][Train](356/100000) average pos_loss: 0.6457855701446533\n",
      "[proc 0][Train](356/100000) average neg_loss: 0.5110259056091309\n",
      "[proc 0][Train](356/100000) average loss: 0.5784057378768921\n",
      "[proc 0][Train](356/100000) average regularization: 7.367325451923534e-06\n",
      "[proc 0][Train] 1 steps take 2.803 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.420, backward: 0.070, update: 2.310\n",
      "[proc 1][Train](356/100000) average pos_loss: 0.6540741324424744\n",
      "[proc 1][Train](356/100000) average neg_loss: 0.5625033378601074\n",
      "[proc 1][Train](356/100000) average loss: 0.6082887649536133\n",
      "[proc 1][Train](356/100000) average regularization: 7.255478976730956e-06\n",
      "[proc 1][Train] 1 steps take 2.721 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.466, backward: 0.069, update: 2.185\n",
      "[proc 0][Train](357/100000) average pos_loss: 0.6221899390220642\n",
      "[proc 0][Train](357/100000) average neg_loss: 0.29911118745803833\n",
      "[proc 0][Train](357/100000) average loss: 0.46065056324005127\n",
      "[proc 0][Train](357/100000) average regularization: 7.126023774617352e-06\n",
      "[proc 0][Train] 1 steps take 2.881 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.377\n",
      "[proc 1][Train](357/100000) average pos_loss: 0.6458539962768555\n",
      "[proc 1][Train](357/100000) average neg_loss: 0.2859620749950409\n",
      "[proc 1][Train](357/100000) average loss: 0.4659080505371094\n",
      "[proc 1][Train](357/100000) average regularization: 7.0202959250309505e-06\n",
      "[proc 1][Train] 1 steps take 2.706 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.208\n",
      "[proc 0][Train](358/100000) average pos_loss: 0.6183530688285828\n",
      "[proc 0][Train](358/100000) average neg_loss: 0.5777726173400879\n",
      "[proc 0][Train](358/100000) average loss: 0.5980628728866577\n",
      "[proc 0][Train](358/100000) average regularization: 7.2133002504415344e-06\n",
      "[proc 0][Train] 1 steps take 2.720 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.465, backward: 0.069, update: 2.184\n",
      "[proc 1][Train](358/100000) average pos_loss: 0.671154797077179\n",
      "[proc 1][Train](358/100000) average neg_loss: 0.5395326614379883\n",
      "[proc 1][Train](358/100000) average loss: 0.6053436994552612\n",
      "[proc 1][Train](358/100000) average regularization: 7.30332794773858e-06\n",
      "[proc 1][Train] 1 steps take 2.811 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.461, backward: 0.070, update: 2.279\n",
      "[proc 0][Train](359/100000) average pos_loss: 0.6261821985244751\n",
      "[proc 0][Train](359/100000) average neg_loss: 0.2959662675857544\n",
      "[proc 0][Train](359/100000) average loss: 0.46107423305511475\n",
      "[proc 0][Train](359/100000) average regularization: 7.110681963240495e-06\n",
      "[proc 0][Train] 1 steps take 2.723 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.440, backward: 0.070, update: 2.212\n",
      "[proc 1][Train](359/100000) average pos_loss: 0.6810902953147888\n",
      "[proc 1][Train](359/100000) average neg_loss: 0.2818525433540344\n",
      "[proc 1][Train](359/100000) average loss: 0.4814714193344116\n",
      "[proc 1][Train](359/100000) average regularization: 7.109455054887803e-06\n",
      "[proc 1][Train] 1 steps take 2.581 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.421, backward: 0.072, update: 2.087\n",
      "[proc 0][Train](360/100000) average pos_loss: 0.6396517753601074\n",
      "[proc 0][Train](360/100000) average neg_loss: 0.5885169506072998\n",
      "[proc 0][Train](360/100000) average loss: 0.6140843629837036\n",
      "[proc 0][Train](360/100000) average regularization: 6.990155270614196e-06\n",
      "[proc 0][Train] 1 steps take 2.814 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.440, backward: 0.069, update: 2.304\n",
      "[proc 1][Train](360/100000) average pos_loss: 0.6262222528457642\n",
      "[proc 1][Train](360/100000) average neg_loss: 0.5806901454925537\n",
      "[proc 1][Train](360/100000) average loss: 0.6034561991691589\n",
      "[proc 1][Train](360/100000) average regularization: 7.0733781285525765e-06\n",
      "[proc 1][Train] 1 steps take 2.603 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.412, backward: 0.069, update: 2.121\n",
      "[proc 0][Train](361/100000) average pos_loss: 0.6853563785552979\n",
      "[proc 0][Train](361/100000) average neg_loss: 0.32080939412117004\n",
      "[proc 0][Train](361/100000) average loss: 0.5030828714370728\n",
      "[proc 0][Train](361/100000) average regularization: 6.993382157816086e-06\n",
      "[proc 0][Train] 1 steps take 2.726 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.443, backward: 0.070, update: 2.210\n",
      "[proc 1][Train](361/100000) average pos_loss: 0.6231986284255981\n",
      "[proc 1][Train](361/100000) average neg_loss: 0.2779332101345062\n",
      "[proc 1][Train](361/100000) average loss: 0.4505659341812134\n",
      "[proc 1][Train](361/100000) average regularization: 7.148124950617785e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.440, backward: 0.070, update: 2.158\n",
      "[proc 0][Train](362/100000) average pos_loss: 0.6513362526893616\n",
      "[proc 0][Train](362/100000) average neg_loss: 0.5874396562576294\n",
      "[proc 0][Train](362/100000) average loss: 0.6193879842758179\n",
      "[proc 0][Train](362/100000) average regularization: 7.226211437227903e-06\n",
      "[proc 0][Train] 1 steps take 2.585 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.411, backward: 0.070, update: 2.102\n",
      "[proc 1][Train](362/100000) average pos_loss: 0.6170308589935303\n",
      "[proc 1][Train](362/100000) average neg_loss: 0.5529897212982178\n",
      "[proc 1][Train](362/100000) average loss: 0.585010290145874\n",
      "[proc 1][Train](362/100000) average regularization: 7.167733201640658e-06\n",
      "[proc 1][Train] 1 steps take 2.648 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.424, backward: 0.070, update: 2.152\n",
      "[proc 0][Train](363/100000) average pos_loss: 0.633612871170044\n",
      "[proc 0][Train](363/100000) average neg_loss: 0.29595261812210083\n",
      "[proc 0][Train](363/100000) average loss: 0.4647827446460724\n",
      "[proc 0][Train](363/100000) average regularization: 7.215489404188702e-06\n",
      "[proc 0][Train] 1 steps take 2.696 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.436, backward: 0.071, update: 2.187\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](363/100000) average pos_loss: 0.6725837588310242\n",
      "[proc 1][Train](363/100000) average neg_loss: 0.28247883915901184\n",
      "[proc 1][Train](363/100000) average loss: 0.4775313138961792\n",
      "[proc 1][Train](363/100000) average regularization: 7.22412187315058e-06\n",
      "[proc 1][Train] 1 steps take 2.642 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.139\n",
      "[proc 0][Train](364/100000) average pos_loss: 0.6172258257865906\n",
      "[proc 0][Train](364/100000) average neg_loss: 0.5630843639373779\n",
      "[proc 0][Train](364/100000) average loss: 0.5901551246643066\n",
      "[proc 0][Train](364/100000) average regularization: 6.816256245656405e-06\n",
      "[proc 0][Train] 1 steps take 2.681 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.176\n",
      "[proc 1][Train](364/100000) average pos_loss: 0.5968049168586731\n",
      "[proc 1][Train](364/100000) average neg_loss: 0.561972975730896\n",
      "[proc 1][Train](364/100000) average loss: 0.5793889760971069\n",
      "[proc 1][Train](364/100000) average regularization: 7.321794328163378e-06\n",
      "[proc 1][Train] 1 steps take 2.685 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.179\n",
      "[proc 0][Train](365/100000) average pos_loss: 0.6704957485198975\n",
      "[proc 0][Train](365/100000) average neg_loss: 0.3260872960090637\n",
      "[proc 0][Train](365/100000) average loss: 0.4982915222644806\n",
      "[proc 0][Train](365/100000) average regularization: 7.349161478487076e-06\n",
      "[proc 0][Train] 1 steps take 2.682 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.070, update: 2.176\n",
      "[proc 1][Train](365/100000) average pos_loss: 0.6368319988250732\n",
      "[proc 1][Train](365/100000) average neg_loss: 0.30504685640335083\n",
      "[proc 1][Train](365/100000) average loss: 0.47093942761421204\n",
      "[proc 1][Train](365/100000) average regularization: 6.958123776712455e-06\n",
      "[proc 1][Train] 1 steps take 2.652 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.147\n",
      "[proc 0][Train](366/100000) average pos_loss: 0.6267961263656616\n",
      "[proc 0][Train](366/100000) average neg_loss: 0.5471982955932617\n",
      "[proc 0][Train](366/100000) average loss: 0.5869972109794617\n",
      "[proc 0][Train](366/100000) average regularization: 7.346755410253536e-06\n",
      "[proc 0][Train] 1 steps take 2.642 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.426, backward: 0.069, update: 2.145\n",
      "[proc 1][Train](366/100000) average pos_loss: 0.6329900026321411\n",
      "[proc 1][Train](366/100000) average neg_loss: 0.5492763519287109\n",
      "[proc 1][Train](366/100000) average loss: 0.591133177280426\n",
      "[proc 1][Train](366/100000) average regularization: 7.121940143406391e-06\n",
      "[proc 1][Train] 1 steps take 2.694 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.069, update: 2.189\n",
      "[proc 0][Train](367/100000) average pos_loss: 0.6229145526885986\n",
      "[proc 0][Train](367/100000) average neg_loss: 0.2857825756072998\n",
      "[proc 0][Train](367/100000) average loss: 0.4543485641479492\n",
      "[proc 0][Train](367/100000) average regularization: 7.251002443808829e-06\n",
      "[proc 0][Train] 1 steps take 2.669 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.437, backward: 0.070, update: 2.160\n",
      "[proc 1][Train](367/100000) average pos_loss: 0.5928353071212769\n",
      "[proc 1][Train](367/100000) average neg_loss: 0.31067878007888794\n",
      "[proc 1][Train](367/100000) average loss: 0.4517570436000824\n",
      "[proc 1][Train](367/100000) average regularization: 7.044761787255993e-06\n",
      "[proc 1][Train] 1 steps take 2.662 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.154\n",
      "[proc 0][Train](368/100000) average pos_loss: 0.6275472640991211\n",
      "[proc 0][Train](368/100000) average neg_loss: 0.5647525191307068\n",
      "[proc 0][Train](368/100000) average loss: 0.5961499214172363\n",
      "[proc 0][Train](368/100000) average regularization: 7.1428744377044495e-06\n",
      "[proc 0][Train] 1 steps take 2.708 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.203\n",
      "[proc 1][Train](368/100000) average pos_loss: 0.6307255029678345\n",
      "[proc 1][Train](368/100000) average neg_loss: 0.5520015954971313\n",
      "[proc 1][Train](368/100000) average loss: 0.5913635492324829\n",
      "[proc 1][Train](368/100000) average regularization: 7.342152457567863e-06\n",
      "[proc 1][Train] 1 steps take 2.904 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.417, backward: 0.070, update: 2.416\n",
      "[proc 0][Train](369/100000) average pos_loss: 0.6140140295028687\n",
      "[proc 0][Train](369/100000) average neg_loss: 0.2970826029777527\n",
      "[proc 0][Train](369/100000) average loss: 0.45554831624031067\n",
      "[proc 0][Train](369/100000) average regularization: 7.077875579852844e-06\n",
      "[proc 0][Train] 1 steps take 2.849 seconds\n",
      "[proc 0]sample: 0.103, forward: 0.421, backward: 0.069, update: 2.255\n",
      "[proc 1][Train](369/100000) average pos_loss: 0.6027154922485352\n",
      "[proc 1][Train](369/100000) average neg_loss: 0.29542091488838196\n",
      "[proc 1][Train](369/100000) average loss: 0.44906818866729736\n",
      "[proc 1][Train](369/100000) average regularization: 7.379626367765013e-06\n",
      "[proc 1][Train] 1 steps take 2.700 seconds\n",
      "[proc 1]sample: 0.108, forward: 0.415, backward: 0.073, update: 2.103\n",
      "[proc 0][Train](370/100000) average pos_loss: 0.6089894771575928\n",
      "[proc 0][Train](370/100000) average neg_loss: 0.5691105723381042\n",
      "[proc 0][Train](370/100000) average loss: 0.5890500545501709\n",
      "[proc 0][Train](370/100000) average regularization: 7.1566651058674324e-06\n",
      "[proc 0][Train] 1 steps take 2.704 seconds\n",
      "[proc 0]sample: 0.021, forward: 0.427, backward: 0.071, update: 2.186\n",
      "[proc 1][Train](370/100000) average pos_loss: 0.6680763363838196\n",
      "[proc 1][Train](370/100000) average neg_loss: 0.5452028512954712\n",
      "[proc 1][Train](370/100000) average loss: 0.6066396236419678\n",
      "[proc 1][Train](370/100000) average regularization: 7.440394711011322e-06\n",
      "[proc 1][Train] 1 steps take 2.740 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.427, backward: 0.071, update: 2.224\n",
      "[proc 0][Train](371/100000) average pos_loss: 0.6094452142715454\n",
      "[proc 0][Train](371/100000) average neg_loss: 0.299316942691803\n",
      "[proc 0][Train](371/100000) average loss: 0.4543810784816742\n",
      "[proc 0][Train](371/100000) average regularization: 7.27590349924867e-06\n",
      "[proc 0][Train] 1 steps take 2.650 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.149\n",
      "[proc 1][Train](371/100000) average pos_loss: 0.6370249390602112\n",
      "[proc 1][Train](371/100000) average neg_loss: 0.30123913288116455\n",
      "[proc 1][Train](371/100000) average loss: 0.46913203597068787\n",
      "[proc 1][Train](371/100000) average regularization: 7.219293365778867e-06\n",
      "[proc 1][Train] 1 steps take 2.611 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.422, backward: 0.070, update: 2.117\n",
      "[proc 0][Train](372/100000) average pos_loss: 0.6297581791877747\n",
      "[proc 0][Train](372/100000) average neg_loss: 0.5768024325370789\n",
      "[proc 0][Train](372/100000) average loss: 0.6032803058624268\n",
      "[proc 0][Train](372/100000) average regularization: 6.996473985054763e-06\n",
      "[proc 0][Train] 1 steps take 2.662 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.155\n",
      "[proc 1][Train](372/100000) average pos_loss: 0.6603745222091675\n",
      "[proc 1][Train](372/100000) average neg_loss: 0.5710612535476685\n",
      "[proc 1][Train](372/100000) average loss: 0.615717887878418\n",
      "[proc 1][Train](372/100000) average regularization: 7.381997875199886e-06\n",
      "[proc 1][Train] 1 steps take 2.648 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.440, backward: 0.069, update: 2.138\n",
      "[proc 0][Train](373/100000) average pos_loss: 0.6356953382492065\n",
      "[proc 0][Train](373/100000) average neg_loss: 0.3145187199115753\n",
      "[proc 0][Train](373/100000) average loss: 0.47510701417922974\n",
      "[proc 0][Train](373/100000) average regularization: 7.289145742106484e-06\n",
      "[proc 0][Train] 1 steps take 2.667 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.169\n",
      "[proc 1][Train](373/100000) average pos_loss: 0.6348165273666382\n",
      "[proc 1][Train](373/100000) average neg_loss: 0.299245148897171\n",
      "[proc 1][Train](373/100000) average loss: 0.4670308232307434\n",
      "[proc 1][Train](373/100000) average regularization: 7.236327292048372e-06\n",
      "[proc 1][Train] 1 steps take 2.718 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.069, update: 2.209\n",
      "[proc 0][Train](374/100000) average pos_loss: 0.6203578114509583\n",
      "[proc 0][Train](374/100000) average neg_loss: 0.5971602201461792\n",
      "[proc 0][Train](374/100000) average loss: 0.6087590456008911\n",
      "[proc 0][Train](374/100000) average regularization: 7.424701834679581e-06\n",
      "[proc 0][Train] 1 steps take 2.644 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.146\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](374/100000) average pos_loss: 0.6063521504402161\n",
      "[proc 1][Train](374/100000) average neg_loss: 0.5654639005661011\n",
      "[proc 1][Train](374/100000) average loss: 0.585908055305481\n",
      "[proc 1][Train](374/100000) average regularization: 7.056480626488337e-06\n",
      "[proc 1][Train] 1 steps take 2.723 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.220\n",
      "[proc 0][Train](375/100000) average pos_loss: 0.6789630055427551\n",
      "[proc 0][Train](375/100000) average neg_loss: 0.30679166316986084\n",
      "[proc 0][Train](375/100000) average loss: 0.492877334356308\n",
      "[proc 0][Train](375/100000) average regularization: 7.163875125115737e-06\n",
      "[proc 0][Train] 1 steps take 2.646 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.071, update: 2.142\n",
      "[proc 1][Train](375/100000) average pos_loss: 0.6232752799987793\n",
      "[proc 1][Train](375/100000) average neg_loss: 0.29547813534736633\n",
      "[proc 1][Train](375/100000) average loss: 0.4593766927719116\n",
      "[proc 1][Train](375/100000) average regularization: 7.301699952222407e-06\n",
      "[proc 1][Train] 1 steps take 2.586 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.086\n",
      "[proc 0][Train](376/100000) average pos_loss: 0.6352523565292358\n",
      "[proc 0][Train](376/100000) average neg_loss: 0.6550348997116089\n",
      "[proc 0][Train](376/100000) average loss: 0.6451436281204224\n",
      "[proc 0][Train](376/100000) average regularization: 6.935577403055504e-06\n",
      "[proc 0][Train] 1 steps take 2.626 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.122\n",
      "[proc 1][Train](376/100000) average pos_loss: 0.6442034244537354\n",
      "[proc 1][Train](376/100000) average neg_loss: 0.5980439186096191\n",
      "[proc 1][Train](376/100000) average loss: 0.6211236715316772\n",
      "[proc 1][Train](376/100000) average regularization: 7.176830422395142e-06\n",
      "[proc 1][Train] 1 steps take 2.620 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.121\n",
      "[proc 0][Train](377/100000) average pos_loss: 0.6890653967857361\n",
      "[proc 0][Train](377/100000) average neg_loss: 0.2820768356323242\n",
      "[proc 0][Train](377/100000) average loss: 0.48557111620903015\n",
      "[proc 0][Train](377/100000) average regularization: 7.445518804161111e-06\n",
      "[proc 0][Train] 1 steps take 2.643 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.145\n",
      "[proc 1][Train](377/100000) average pos_loss: 0.6377103328704834\n",
      "[proc 1][Train](377/100000) average neg_loss: 0.3058018982410431\n",
      "[proc 1][Train](377/100000) average loss: 0.47175610065460205\n",
      "[proc 1][Train](377/100000) average regularization: 7.251964689203305e-06\n",
      "[proc 1][Train] 1 steps take 2.628 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.431, backward: 0.070, update: 2.125\n",
      "[proc 0][Train](378/100000) average pos_loss: 0.6007649302482605\n",
      "[proc 0][Train](378/100000) average neg_loss: 0.5766177773475647\n",
      "[proc 0][Train](378/100000) average loss: 0.5886913537979126\n",
      "[proc 0][Train](378/100000) average regularization: 7.158831067499705e-06\n",
      "[proc 0][Train] 1 steps take 2.579 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.069, update: 2.079\n",
      "[proc 1][Train](378/100000) average pos_loss: 0.5787488222122192\n",
      "[proc 1][Train](378/100000) average neg_loss: 0.5511820316314697\n",
      "[proc 1][Train](378/100000) average loss: 0.5649654269218445\n",
      "[proc 1][Train](378/100000) average regularization: 7.152019406930776e-06\n",
      "[proc 1][Train] 1 steps take 2.679 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.179\n",
      "[proc 0][Train](379/100000) average pos_loss: 0.6342992782592773\n",
      "[proc 0][Train](379/100000) average neg_loss: 0.2912241220474243\n",
      "[proc 0][Train](379/100000) average loss: 0.46276170015335083\n",
      "[proc 0][Train](379/100000) average regularization: 7.530659786425531e-06\n",
      "[proc 0][Train] 1 steps take 2.631 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.071, update: 2.124\n",
      "[proc 1][Train](379/100000) average pos_loss: 0.6245108246803284\n",
      "[proc 1][Train](379/100000) average neg_loss: 0.30527180433273315\n",
      "[proc 1][Train](379/100000) average loss: 0.46489131450653076\n",
      "[proc 1][Train](379/100000) average regularization: 7.267432465596357e-06\n",
      "[proc 1][Train] 1 steps take 2.654 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.430, backward: 0.070, update: 2.153\n",
      "[proc 0][Train](380/100000) average pos_loss: 0.6144622564315796\n",
      "[proc 0][Train](380/100000) average neg_loss: 0.5824328064918518\n",
      "[proc 0][Train](380/100000) average loss: 0.5984475612640381\n",
      "[proc 0][Train](380/100000) average regularization: 7.039081083348719e-06\n",
      "[proc 0][Train] 1 steps take 2.686 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.438, backward: 0.069, update: 2.178\n",
      "[proc 1][Train](380/100000) average pos_loss: 0.6536551713943481\n",
      "[proc 1][Train](380/100000) average neg_loss: 0.5674306750297546\n",
      "[proc 1][Train](380/100000) average loss: 0.610542893409729\n",
      "[proc 1][Train](380/100000) average regularization: 7.182868102972861e-06\n",
      "[proc 1][Train] 1 steps take 2.588 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.094\n",
      "[proc 0][Train](381/100000) average pos_loss: 0.6301239728927612\n",
      "[proc 0][Train](381/100000) average neg_loss: 0.2736437916755676\n",
      "[proc 0][Train](381/100000) average loss: 0.45188388228416443\n",
      "[proc 0][Train](381/100000) average regularization: 7.661122253921349e-06\n",
      "[proc 0][Train] 1 steps take 2.630 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.123\n",
      "[proc 1][Train](381/100000) average pos_loss: 0.628083348274231\n",
      "[proc 1][Train](381/100000) average neg_loss: 0.30605465173721313\n",
      "[proc 1][Train](381/100000) average loss: 0.46706900000572205\n",
      "[proc 1][Train](381/100000) average regularization: 7.414791070914362e-06\n",
      "[proc 1][Train] 1 steps take 2.644 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.418, backward: 0.071, update: 2.154\n",
      "[proc 0][Train](382/100000) average pos_loss: 0.6221994161605835\n",
      "[proc 0][Train](382/100000) average neg_loss: 0.5764919519424438\n",
      "[proc 0][Train](382/100000) average loss: 0.5993456840515137\n",
      "[proc 0][Train](382/100000) average regularization: 7.505732810386689e-06\n",
      "[proc 0][Train] 1 steps take 2.610 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.430, backward: 0.070, update: 2.109\n",
      "[proc 1][Train](382/100000) average pos_loss: 0.6180965304374695\n",
      "[proc 1][Train](382/100000) average neg_loss: 0.5823812484741211\n",
      "[proc 1][Train](382/100000) average loss: 0.6002389192581177\n",
      "[proc 1][Train](382/100000) average regularization: 7.431121957779396e-06\n",
      "[proc 1][Train] 1 steps take 2.646 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.146\n",
      "[proc 0][Train](383/100000) average pos_loss: 0.6068769693374634\n",
      "[proc 0][Train](383/100000) average neg_loss: 0.3099622130393982\n",
      "[proc 0][Train](383/100000) average loss: 0.4584195911884308\n",
      "[proc 0][Train](383/100000) average regularization: 7.522813575633336e-06\n",
      "[proc 0][Train] 1 steps take 2.631 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.428, backward: 0.070, update: 2.132\n",
      "[proc 1][Train](383/100000) average pos_loss: 0.6124466061592102\n",
      "[proc 1][Train](383/100000) average neg_loss: 0.2728123664855957\n",
      "[proc 1][Train](383/100000) average loss: 0.44262948632240295\n",
      "[proc 1][Train](383/100000) average regularization: 7.5806160566571634e-06\n",
      "[proc 1][Train] 1 steps take 2.620 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.126\n",
      "[proc 0][Train](384/100000) average pos_loss: 0.6457167863845825\n",
      "[proc 0][Train](384/100000) average neg_loss: 0.6055765748023987\n",
      "[proc 0][Train](384/100000) average loss: 0.625646710395813\n",
      "[proc 0][Train](384/100000) average regularization: 7.322506917262217e-06\n",
      "[proc 0][Train] 1 steps take 2.645 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.434, backward: 0.069, update: 2.140\n",
      "[proc 1][Train](384/100000) average pos_loss: 0.641714334487915\n",
      "[proc 1][Train](384/100000) average neg_loss: 0.5766851305961609\n",
      "[proc 1][Train](384/100000) average loss: 0.6091997623443604\n",
      "[proc 1][Train](384/100000) average regularization: 7.465285762009444e-06\n",
      "[proc 1][Train] 1 steps take 2.562 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.061\n",
      "[proc 0][Train](385/100000) average pos_loss: 0.6505687832832336\n",
      "[proc 0][Train](385/100000) average neg_loss: 0.29973340034484863\n",
      "[proc 0][Train](385/100000) average loss: 0.47515109181404114\n",
      "[proc 0][Train](385/100000) average regularization: 7.421750069624977e-06\n",
      "[proc 0][Train] 1 steps take 2.680 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.430, backward: 0.069, update: 2.165\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](385/100000) average pos_loss: 0.6040066480636597\n",
      "[proc 1][Train](385/100000) average neg_loss: 0.31398820877075195\n",
      "[proc 1][Train](385/100000) average loss: 0.4589974284172058\n",
      "[proc 1][Train](385/100000) average regularization: 7.438658940372989e-06\n",
      "[proc 1][Train] 1 steps take 2.852 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.433, backward: 0.069, update: 2.332\n",
      "[proc 0][Train](386/100000) average pos_loss: 0.6193631887435913\n",
      "[proc 0][Train](386/100000) average neg_loss: 0.6014052629470825\n",
      "[proc 0][Train](386/100000) average loss: 0.6103842258453369\n",
      "[proc 0][Train](386/100000) average regularization: 7.563005965494085e-06\n",
      "[proc 0][Train] 1 steps take 2.739 seconds\n",
      "[proc 0]sample: 0.018, forward: 0.446, backward: 0.070, update: 2.205\n",
      "[proc 1][Train](386/100000) average pos_loss: 0.6185507774353027\n",
      "[proc 1][Train](386/100000) average neg_loss: 0.5772465467453003\n",
      "[proc 1][Train](386/100000) average loss: 0.5978986620903015\n",
      "[proc 1][Train](386/100000) average regularization: 7.446174549841089e-06\n",
      "[proc 1][Train] 1 steps take 2.681 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.433, backward: 0.070, update: 2.162\n",
      "[proc 0][Train](387/100000) average pos_loss: 0.6272358894348145\n",
      "[proc 0][Train](387/100000) average neg_loss: 0.305808424949646\n",
      "[proc 0][Train](387/100000) average loss: 0.4665221571922302\n",
      "[proc 0][Train](387/100000) average regularization: 7.380945135082584e-06\n",
      "[proc 0][Train] 1 steps take 2.605 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.433, backward: 0.070, update: 2.101\n",
      "[proc 1][Train](387/100000) average pos_loss: 0.6465606689453125\n",
      "[proc 1][Train](387/100000) average neg_loss: 0.28814932703971863\n",
      "[proc 1][Train](387/100000) average loss: 0.46735501289367676\n",
      "[proc 1][Train](387/100000) average regularization: 7.479946816602023e-06\n",
      "[proc 1][Train] 1 steps take 2.643 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.425, backward: 0.070, update: 2.146\n",
      "[proc 0][Train](388/100000) average pos_loss: 0.5924848914146423\n",
      "[proc 0][Train](388/100000) average neg_loss: 0.5483132004737854\n",
      "[proc 0][Train](388/100000) average loss: 0.5703990459442139\n",
      "[proc 0][Train](388/100000) average regularization: 7.2329557951889e-06\n",
      "[proc 0][Train] 1 steps take 2.659 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.444, backward: 0.069, update: 2.144\n",
      "[proc 1][Train](388/100000) average pos_loss: 0.643364667892456\n",
      "[proc 1][Train](388/100000) average neg_loss: 0.5195843577384949\n",
      "[proc 1][Train](388/100000) average loss: 0.5814745426177979\n",
      "[proc 1][Train](388/100000) average regularization: 7.392452516796766e-06\n",
      "[proc 1][Train] 1 steps take 2.505 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.417, backward: 0.070, update: 2.016\n",
      "[proc 0][Train](389/100000) average pos_loss: 0.6232085227966309\n",
      "[proc 0][Train](389/100000) average neg_loss: 0.29618871212005615\n",
      "[proc 0][Train](389/100000) average loss: 0.4596986174583435\n",
      "[proc 0][Train](389/100000) average regularization: 7.5697175816458184e-06\n",
      "[proc 0][Train] 1 steps take 2.763 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.256\n",
      "[proc 1][Train](389/100000) average pos_loss: 0.6335325241088867\n",
      "[proc 1][Train](389/100000) average neg_loss: 0.28530657291412354\n",
      "[proc 1][Train](389/100000) average loss: 0.4594195485115051\n",
      "[proc 1][Train](389/100000) average regularization: 7.377926067420049e-06\n",
      "[proc 1][Train] 1 steps take 2.681 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.418, backward: 0.070, update: 2.192\n",
      "[proc 0][Train](390/100000) average pos_loss: 0.580622673034668\n",
      "[proc 0][Train](390/100000) average neg_loss: 0.5688508152961731\n",
      "[proc 0][Train](390/100000) average loss: 0.5747367143630981\n",
      "[proc 0][Train](390/100000) average regularization: 7.106531029421603e-06\n",
      "[proc 0][Train] 1 steps take 2.751 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.439, backward: 0.070, update: 2.240\n",
      "[proc 1][Train](390/100000) average pos_loss: 0.6246253252029419\n",
      "[proc 1][Train](390/100000) average neg_loss: 0.5375497341156006\n",
      "[proc 1][Train](390/100000) average loss: 0.5810875296592712\n",
      "[proc 1][Train](390/100000) average regularization: 7.131063284759875e-06\n",
      "[proc 1][Train] 1 steps take 2.645 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.400, backward: 0.070, update: 2.174\n",
      "[proc 0][Train](391/100000) average pos_loss: 0.6493849754333496\n",
      "[proc 0][Train](391/100000) average neg_loss: 0.29616737365722656\n",
      "[proc 0][Train](391/100000) average loss: 0.4727761745452881\n",
      "[proc 0][Train](391/100000) average regularization: 7.471910066669807e-06\n",
      "[proc 0][Train] 1 steps take 2.672 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.429, backward: 0.070, update: 2.172\n",
      "[proc 1][Train](391/100000) average pos_loss: 0.6397003531455994\n",
      "[proc 1][Train](391/100000) average neg_loss: 0.2932855486869812\n",
      "[proc 1][Train](391/100000) average loss: 0.4664929509162903\n",
      "[proc 1][Train](391/100000) average regularization: 7.730901415925473e-06\n",
      "[proc 1][Train] 1 steps take 2.800 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.427, backward: 0.070, update: 2.301\n",
      "[proc 0][Train](392/100000) average pos_loss: 0.6112022399902344\n",
      "[proc 0][Train](392/100000) average neg_loss: 0.600969135761261\n",
      "[proc 0][Train](392/100000) average loss: 0.6060856580734253\n",
      "[proc 0][Train](392/100000) average regularization: 7.547672339569544e-06\n",
      "[proc 0][Train] 1 steps take 2.711 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.070, update: 2.207\n",
      "[proc 1][Train](392/100000) average pos_loss: 0.6203914880752563\n",
      "[proc 1][Train](392/100000) average neg_loss: 0.567118227481842\n",
      "[proc 1][Train](392/100000) average loss: 0.5937548875808716\n",
      "[proc 1][Train](392/100000) average regularization: 7.571180049126269e-06\n",
      "[proc 1][Train] 1 steps take 2.645 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.439, backward: 0.070, update: 2.135\n",
      "[proc 0][Train](393/100000) average pos_loss: 0.6171537637710571\n",
      "[proc 0][Train](393/100000) average neg_loss: 0.2963966131210327\n",
      "[proc 0][Train](393/100000) average loss: 0.4567751884460449\n",
      "[proc 0][Train](393/100000) average regularization: 7.675488632230554e-06\n",
      "[proc 0][Train] 1 steps take 2.655 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.441, backward: 0.070, update: 2.142\n",
      "[proc 1][Train](393/100000) average pos_loss: 0.6344968676567078\n",
      "[proc 1][Train](393/100000) average neg_loss: 0.28647446632385254\n",
      "[proc 1][Train](393/100000) average loss: 0.46048566699028015\n",
      "[proc 1][Train](393/100000) average regularization: 7.406042641378008e-06\n",
      "[proc 1][Train] 1 steps take 2.722 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.446, backward: 0.070, update: 2.205\n",
      "[proc 0][Train](394/100000) average pos_loss: 0.6199824810028076\n",
      "[proc 0][Train](394/100000) average neg_loss: 0.550342321395874\n",
      "[proc 0][Train](394/100000) average loss: 0.5851624011993408\n",
      "[proc 0][Train](394/100000) average regularization: 7.529826234531356e-06\n",
      "[proc 0][Train] 1 steps take 2.761 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.437, backward: 0.071, update: 2.251\n",
      "[proc 1][Train](394/100000) average pos_loss: 0.5879998207092285\n",
      "[proc 1][Train](394/100000) average neg_loss: 0.5951244831085205\n",
      "[proc 1][Train](394/100000) average loss: 0.5915621519088745\n",
      "[proc 1][Train](394/100000) average regularization: 7.785790330672171e-06\n",
      "[proc 1][Train] 1 steps take 2.813 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.450, backward: 0.070, update: 2.292\n",
      "[proc 0][Train](395/100000) average pos_loss: 0.6067347526550293\n",
      "[proc 0][Train](395/100000) average neg_loss: 0.3079266846179962\n",
      "[proc 0][Train](395/100000) average loss: 0.45733070373535156\n",
      "[proc 0][Train](395/100000) average regularization: 7.501079835492419e-06\n",
      "[proc 0][Train] 1 steps take 2.678 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.180\n",
      "[proc 1][Train](395/100000) average pos_loss: 0.6042662858963013\n",
      "[proc 1][Train](395/100000) average neg_loss: 0.3002420663833618\n",
      "[proc 1][Train](395/100000) average loss: 0.45225417613983154\n",
      "[proc 1][Train](395/100000) average regularization: 7.688637197134085e-06\n",
      "[proc 1][Train] 1 steps take 2.598 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.069, update: 2.093\n",
      "[proc 0][Train](396/100000) average pos_loss: 0.6036856770515442\n",
      "[proc 0][Train](396/100000) average neg_loss: 0.5818673968315125\n",
      "[proc 0][Train](396/100000) average loss: 0.5927765369415283\n",
      "[proc 0][Train](396/100000) average regularization: 7.600860044476576e-06\n",
      "[proc 0][Train] 1 steps take 2.796 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.488, backward: 0.071, update: 2.235\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](396/100000) average pos_loss: 0.6240627765655518\n",
      "[proc 1][Train](396/100000) average neg_loss: 0.5802360773086548\n",
      "[proc 1][Train](396/100000) average loss: 0.6021494269371033\n",
      "[proc 1][Train](396/100000) average regularization: 7.480491603928385e-06\n",
      "[proc 1][Train] 1 steps take 2.736 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.229\n",
      "[proc 0][Train](397/100000) average pos_loss: 0.6604918837547302\n",
      "[proc 0][Train](397/100000) average neg_loss: 0.2610642611980438\n",
      "[proc 0][Train](397/100000) average loss: 0.46077805757522583\n",
      "[proc 0][Train](397/100000) average regularization: 7.804143933753949e-06\n",
      "[proc 0][Train] 1 steps take 2.717 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.435, backward: 0.070, update: 2.209\n",
      "[proc 1][Train](397/100000) average pos_loss: 0.6018226146697998\n",
      "[proc 1][Train](397/100000) average neg_loss: 0.29126453399658203\n",
      "[proc 1][Train](397/100000) average loss: 0.4465435743331909\n",
      "[proc 1][Train](397/100000) average regularization: 7.283984359673923e-06\n",
      "[proc 1][Train] 1 steps take 2.756 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.441, backward: 0.070, update: 2.244\n",
      "[proc 0][Train](398/100000) average pos_loss: 0.6328881978988647\n",
      "[proc 0][Train](398/100000) average neg_loss: 0.5955314040184021\n",
      "[proc 0][Train](398/100000) average loss: 0.614209771156311\n",
      "[proc 0][Train](398/100000) average regularization: 7.44865792512428e-06\n",
      "[proc 0][Train] 1 steps take 2.769 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.264\n",
      "[proc 1][Train](398/100000) average pos_loss: 0.5959726572036743\n",
      "[proc 1][Train](398/100000) average neg_loss: 0.5809820294380188\n",
      "[proc 1][Train](398/100000) average loss: 0.588477373123169\n",
      "[proc 1][Train](398/100000) average regularization: 7.493247267120751e-06\n",
      "[proc 1][Train] 1 steps take 2.781 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.433, backward: 0.069, update: 2.278\n",
      "[proc 0][Train](399/100000) average pos_loss: 0.6459255218505859\n",
      "[proc 0][Train](399/100000) average neg_loss: 0.31096333265304565\n",
      "[proc 0][Train](399/100000) average loss: 0.4784444272518158\n",
      "[proc 0][Train](399/100000) average regularization: 7.539502803410869e-06\n",
      "[proc 0][Train] 1 steps take 2.698 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.071, update: 2.194\n",
      "[proc 1][Train](399/100000) average pos_loss: 0.5949968695640564\n",
      "[proc 1][Train](399/100000) average neg_loss: 0.2761344611644745\n",
      "[proc 1][Train](399/100000) average loss: 0.43556565046310425\n",
      "[proc 1][Train](399/100000) average regularization: 7.210766852949746e-06\n",
      "[proc 1][Train] 1 steps take 2.685 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.459, backward: 0.070, update: 2.154\n",
      "[proc 0][Train](400/100000) average pos_loss: 0.6332982778549194\n",
      "[proc 0][Train](400/100000) average neg_loss: 0.5746578574180603\n",
      "[proc 0][Train](400/100000) average loss: 0.6039780378341675\n",
      "[proc 0][Train](400/100000) average regularization: 7.4257823143852875e-06\n",
      "[proc 0][Train] 1 steps take 2.698 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.428, backward: 0.069, update: 2.199\n",
      "[proc 1][Train](400/100000) average pos_loss: 0.5891938209533691\n",
      "[proc 1][Train](400/100000) average neg_loss: 0.5707236528396606\n",
      "[proc 1][Train](400/100000) average loss: 0.5799587368965149\n",
      "[proc 1][Train](400/100000) average regularization: 7.73118699726183e-06\n",
      "[proc 1][Train] 1 steps take 2.698 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.430, backward: 0.071, update: 2.195\n",
      "[proc 0][Train](401/100000) average pos_loss: 0.6077752113342285\n",
      "[proc 0][Train](401/100000) average neg_loss: 0.26222264766693115\n",
      "[proc 0][Train](401/100000) average loss: 0.43499892950057983\n",
      "[proc 0][Train](401/100000) average regularization: 7.646396625204943e-06\n",
      "[proc 0][Train] 1 steps take 2.684 seconds\n",
      "[proc 0]sample: 0.014, forward: 0.430, backward: 0.069, update: 2.170\n",
      "[proc 1][Train](401/100000) average pos_loss: 0.5991768836975098\n",
      "[proc 1][Train](401/100000) average neg_loss: 0.28161266446113586\n",
      "[proc 1][Train](401/100000) average loss: 0.4403947591781616\n",
      "[proc 1][Train](401/100000) average regularization: 7.1539011514687445e-06\n",
      "[proc 1][Train] 1 steps take 2.716 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.458, backward: 0.070, update: 2.169\n",
      "[proc 0][Train](402/100000) average pos_loss: 0.6223708391189575\n",
      "[proc 0][Train](402/100000) average neg_loss: 0.617048442363739\n",
      "[proc 0][Train](402/100000) average loss: 0.6197096109390259\n",
      "[proc 0][Train](402/100000) average regularization: 7.379469479928957e-06\n",
      "[proc 0][Train] 1 steps take 2.859 seconds\n",
      "[proc 0]sample: 0.017, forward: 0.514, backward: 0.074, update: 2.254\n",
      "[proc 1][Train](402/100000) average pos_loss: 0.6013296842575073\n",
      "[proc 1][Train](402/100000) average neg_loss: 0.5550758838653564\n",
      "[proc 1][Train](402/100000) average loss: 0.5782027840614319\n",
      "[proc 1][Train](402/100000) average regularization: 7.4387312452017795e-06\n",
      "[proc 1][Train] 1 steps take 2.802 seconds\n",
      "[proc 1]sample: 0.014, forward: 0.441, backward: 0.069, update: 2.277\n",
      "[proc 0][Train](403/100000) average pos_loss: 0.6334225535392761\n",
      "[proc 0][Train](403/100000) average neg_loss: 0.3139339089393616\n",
      "[proc 0][Train](403/100000) average loss: 0.47367823123931885\n",
      "[proc 0][Train](403/100000) average regularization: 7.333038411161397e-06\n",
      "[proc 0][Train] 1 steps take 2.665 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.071, update: 2.158\n",
      "[proc 1][Train](403/100000) average pos_loss: 0.6040574312210083\n",
      "[proc 1][Train](403/100000) average neg_loss: 0.30070722103118896\n",
      "[proc 1][Train](403/100000) average loss: 0.45238232612609863\n",
      "[proc 1][Train](403/100000) average regularization: 7.516499408666277e-06\n",
      "[proc 1][Train] 1 steps take 2.682 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.432, backward: 0.069, update: 2.179\n",
      "[proc 0][Train](404/100000) average pos_loss: 0.5886944532394409\n",
      "[proc 0][Train](404/100000) average neg_loss: 0.5728620886802673\n",
      "[proc 0][Train](404/100000) average loss: 0.5807782411575317\n",
      "[proc 0][Train](404/100000) average regularization: 7.7426630014088e-06\n",
      "[proc 0][Train] 1 steps take 2.640 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.134\n",
      "[proc 1][Train](404/100000) average pos_loss: 0.6064854264259338\n",
      "[proc 1][Train](404/100000) average neg_loss: 0.5787860155105591\n",
      "[proc 1][Train](404/100000) average loss: 0.5926357507705688\n",
      "[proc 1][Train](404/100000) average regularization: 7.653584361833055e-06\n",
      "[proc 1][Train] 1 steps take 2.664 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.434, backward: 0.070, update: 2.158\n",
      "[proc 0][Train](405/100000) average pos_loss: 0.6128147840499878\n",
      "[proc 0][Train](405/100000) average neg_loss: 0.2976802587509155\n",
      "[proc 0][Train](405/100000) average loss: 0.45524752140045166\n",
      "[proc 0][Train](405/100000) average regularization: 7.743168680462986e-06\n",
      "[proc 0][Train] 1 steps take 2.601 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.436, backward: 0.070, update: 2.094\n",
      "[proc 1][Train](405/100000) average pos_loss: 0.6329436302185059\n",
      "[proc 1][Train](405/100000) average neg_loss: 0.32058969140052795\n",
      "[proc 1][Train](405/100000) average loss: 0.4767666459083557\n",
      "[proc 1][Train](405/100000) average regularization: 7.339070180023555e-06\n",
      "[proc 1][Train] 1 steps take 2.674 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.427, backward: 0.070, update: 2.175\n",
      "[proc 0][Train](406/100000) average pos_loss: 0.6111011505126953\n",
      "[proc 0][Train](406/100000) average neg_loss: 0.6013743281364441\n",
      "[proc 0][Train](406/100000) average loss: 0.6062377691268921\n",
      "[proc 0][Train](406/100000) average regularization: 7.397920398943825e-06\n",
      "[proc 0][Train] 1 steps take 2.690 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.432, backward: 0.069, update: 2.187\n",
      "[proc 1][Train](406/100000) average pos_loss: 0.6047340035438538\n",
      "[proc 1][Train](406/100000) average neg_loss: 0.5873189568519592\n",
      "[proc 1][Train](406/100000) average loss: 0.5960264801979065\n",
      "[proc 1][Train](406/100000) average regularization: 7.650276529602706e-06\n",
      "[proc 1][Train] 1 steps take 2.670 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.070, update: 2.167\n",
      "[proc 0][Train](407/100000) average pos_loss: 0.6237204670906067\n",
      "[proc 0][Train](407/100000) average neg_loss: 0.27547311782836914\n",
      "[proc 0][Train](407/100000) average loss: 0.4495967924594879\n",
      "[proc 0][Train](407/100000) average regularization: 7.916984941402916e-06\n",
      "[proc 0][Train] 1 steps take 2.630 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.434, backward: 0.071, update: 2.124\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](407/100000) average pos_loss: 0.6166107654571533\n",
      "[proc 1][Train](407/100000) average neg_loss: 0.29671019315719604\n",
      "[proc 1][Train](407/100000) average loss: 0.4566604793071747\n",
      "[proc 1][Train](407/100000) average regularization: 7.211967385956086e-06\n",
      "[proc 1][Train] 1 steps take 2.668 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.425, backward: 0.069, update: 2.172\n",
      "[proc 0][Train](408/100000) average pos_loss: 0.6410422325134277\n",
      "[proc 0][Train](408/100000) average neg_loss: 0.5511144399642944\n",
      "[proc 0][Train](408/100000) average loss: 0.5960783362388611\n",
      "[proc 0][Train](408/100000) average regularization: 7.563546660094289e-06\n",
      "[proc 0][Train] 1 steps take 2.638 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.440, backward: 0.070, update: 2.126\n",
      "[proc 1][Train](408/100000) average pos_loss: 0.605036735534668\n",
      "[proc 1][Train](408/100000) average neg_loss: 0.5816142559051514\n",
      "[proc 1][Train](408/100000) average loss: 0.5933254957199097\n",
      "[proc 1][Train](408/100000) average regularization: 7.529004051320953e-06\n",
      "[proc 1][Train] 1 steps take 2.657 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.423, backward: 0.070, update: 2.162\n",
      "[proc 0][Train](409/100000) average pos_loss: 0.619236946105957\n",
      "[proc 0][Train](409/100000) average neg_loss: 0.3027119040489197\n",
      "[proc 0][Train](409/100000) average loss: 0.46097442507743835\n",
      "[proc 0][Train](409/100000) average regularization: 7.260124220920261e-06\n",
      "[proc 0][Train] 1 steps take 2.598 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.429, backward: 0.070, update: 2.097\n",
      "[proc 1][Train](409/100000) average pos_loss: 0.6180082559585571\n",
      "[proc 1][Train](409/100000) average neg_loss: 0.3011159300804138\n",
      "[proc 1][Train](409/100000) average loss: 0.4595620930194855\n",
      "[proc 1][Train](409/100000) average regularization: 7.577420092275133e-06\n",
      "[proc 1][Train] 1 steps take 2.678 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.435, backward: 0.070, update: 2.172\n",
      "[proc 0][Train](410/100000) average pos_loss: 0.5814530849456787\n",
      "[proc 0][Train](410/100000) average neg_loss: 0.6020110845565796\n",
      "[proc 0][Train](410/100000) average loss: 0.5917320847511292\n",
      "[proc 0][Train](410/100000) average regularization: 7.872810783737805e-06\n",
      "[proc 0][Train] 1 steps take 2.681 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.435, backward: 0.069, update: 2.176\n",
      "[proc 1][Train](410/100000) average pos_loss: 0.6002861261367798\n",
      "[proc 1][Train](410/100000) average neg_loss: 0.5423023700714111\n",
      "[proc 1][Train](410/100000) average loss: 0.5712942481040955\n",
      "[proc 1][Train](410/100000) average regularization: 7.705508323851973e-06\n",
      "[proc 1][Train] 1 steps take 2.669 seconds\n",
      "[proc 1]sample: 0.002, forward: 0.438, backward: 0.070, update: 2.159\n",
      "[proc 0][Train](411/100000) average pos_loss: 0.6061726212501526\n",
      "[proc 0][Train](411/100000) average neg_loss: 0.2988744378089905\n",
      "[proc 0][Train](411/100000) average loss: 0.45252352952957153\n",
      "[proc 0][Train](411/100000) average regularization: 7.608537998748943e-06\n",
      "[proc 0][Train] 1 steps take 2.633 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.443, backward: 0.070, update: 2.118\n",
      "[proc 1][Train](411/100000) average pos_loss: 0.5846823453903198\n",
      "[proc 1][Train](411/100000) average neg_loss: 0.2932998836040497\n",
      "[proc 1][Train](411/100000) average loss: 0.43899112939834595\n",
      "[proc 1][Train](411/100000) average regularization: 7.738875865470618e-06\n",
      "[proc 1][Train] 1 steps take 2.581 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.426, backward: 0.070, update: 2.084\n",
      "[proc 0][Train](412/100000) average pos_loss: 0.5744675397872925\n",
      "[proc 0][Train](412/100000) average neg_loss: 0.6160741448402405\n",
      "[proc 0][Train](412/100000) average loss: 0.5952708721160889\n",
      "[proc 0][Train](412/100000) average regularization: 7.2608231675985735e-06\n",
      "[proc 0][Train] 1 steps take 2.815 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.468, backward: 0.070, update: 2.275\n",
      "[proc 1][Train](412/100000) average pos_loss: 0.6129744648933411\n",
      "[proc 1][Train](412/100000) average neg_loss: 0.5266462564468384\n",
      "[proc 1][Train](412/100000) average loss: 0.5698103904724121\n",
      "[proc 1][Train](412/100000) average regularization: 7.74577165429946e-06\n",
      "[proc 1][Train] 1 steps take 2.657 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.404, backward: 0.069, update: 2.184\n",
      "[proc 0][Train](413/100000) average pos_loss: 0.6185044646263123\n",
      "[proc 0][Train](413/100000) average neg_loss: 0.2709421217441559\n",
      "[proc 0][Train](413/100000) average loss: 0.44472330808639526\n",
      "[proc 0][Train](413/100000) average regularization: 7.615465619892348e-06\n",
      "[proc 0][Train] 1 steps take 2.711 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.417, backward: 0.070, update: 2.222\n",
      "[proc 1][Train](413/100000) average pos_loss: 0.6259509921073914\n",
      "[proc 1][Train](413/100000) average neg_loss: 0.30723172426223755\n",
      "[proc 1][Train](413/100000) average loss: 0.46659135818481445\n",
      "[proc 1][Train](413/100000) average regularization: 7.473588539141929e-06\n",
      "[proc 1][Train] 1 steps take 2.668 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.437, backward: 0.070, update: 2.160\n",
      "[proc 0][Train](414/100000) average pos_loss: 0.6175152063369751\n",
      "[proc 0][Train](414/100000) average neg_loss: 0.5421959161758423\n",
      "[proc 0][Train](414/100000) average loss: 0.5798555612564087\n",
      "[proc 0][Train](414/100000) average regularization: 7.420124347845558e-06\n",
      "[proc 0][Train] 1 steps take 2.829 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.431, backward: 0.070, update: 2.327\n",
      "[proc 1][Train](414/100000) average pos_loss: 0.5832828283309937\n",
      "[proc 1][Train](414/100000) average neg_loss: 0.5578058362007141\n",
      "[proc 1][Train](414/100000) average loss: 0.5705443620681763\n",
      "[proc 1][Train](414/100000) average regularization: 7.647197890037205e-06\n",
      "[proc 1][Train] 1 steps take 2.780 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.450, backward: 0.069, update: 2.260\n",
      "[proc 0][Train](415/100000) average pos_loss: 0.6266114115715027\n",
      "[proc 0][Train](415/100000) average neg_loss: 0.3070647716522217\n",
      "[proc 0][Train](415/100000) average loss: 0.4668380916118622\n",
      "[proc 0][Train](415/100000) average regularization: 7.619454208906973e-06\n",
      "[proc 0][Train] 1 steps take 2.730 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.418, backward: 0.070, update: 2.240\n",
      "[proc 1][Train](415/100000) average pos_loss: 0.5946482419967651\n",
      "[proc 1][Train](415/100000) average neg_loss: 0.293972909450531\n",
      "[proc 1][Train](415/100000) average loss: 0.44431057572364807\n",
      "[proc 1][Train](415/100000) average regularization: 7.676269888179377e-06\n",
      "[proc 1][Train] 1 steps take 2.759 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.443, backward: 0.069, update: 2.246\n",
      "[proc 0][Train](416/100000) average pos_loss: 0.6062936186790466\n",
      "[proc 0][Train](416/100000) average neg_loss: 0.536469042301178\n",
      "[proc 0][Train](416/100000) average loss: 0.5713813304901123\n",
      "[proc 0][Train](416/100000) average regularization: 7.4549529927026015e-06\n",
      "[proc 0][Train] 1 steps take 2.751 seconds\n",
      "[proc 0]sample: 0.002, forward: 0.431, backward: 0.070, update: 2.248\n",
      "[proc 1][Train](416/100000) average pos_loss: 0.6043245196342468\n",
      "[proc 1][Train](416/100000) average neg_loss: 0.5683133602142334\n",
      "[proc 1][Train](416/100000) average loss: 0.5863189697265625\n",
      "[proc 1][Train](416/100000) average regularization: 7.51606967241969e-06\n",
      "[proc 1][Train] 1 steps take 2.723 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.446, backward: 0.070, update: 2.206\n",
      "[proc 0][Train](417/100000) average pos_loss: 0.6540689468383789\n",
      "[proc 0][Train](417/100000) average neg_loss: 0.27405306696891785\n",
      "[proc 0][Train](417/100000) average loss: 0.46406102180480957\n",
      "[proc 0][Train](417/100000) average regularization: 7.80378013587324e-06\n",
      "[proc 0][Train] 1 steps take 2.606 seconds\n",
      "[proc 0]sample: 0.016, forward: 0.430, backward: 0.071, update: 2.088\n",
      "[proc 1][Train](417/100000) average pos_loss: 0.6054248809814453\n",
      "[proc 1][Train](417/100000) average neg_loss: 0.2606237530708313\n",
      "[proc 1][Train](417/100000) average loss: 0.4330243170261383\n",
      "[proc 1][Train](417/100000) average regularization: 7.665697921765968e-06\n",
      "[proc 1][Train] 1 steps take 2.737 seconds\n",
      "[proc 1]sample: 0.018, forward: 0.444, backward: 0.069, update: 2.206\n",
      "[proc 0][Train](418/100000) average pos_loss: 0.6190648078918457\n",
      "[proc 0][Train](418/100000) average neg_loss: 0.5699795484542847\n",
      "[proc 0][Train](418/100000) average loss: 0.5945221781730652\n",
      "[proc 0][Train](418/100000) average regularization: 7.248577276186552e-06\n",
      "[proc 0][Train] 1 steps take 2.705 seconds\n",
      "[proc 0]sample: 0.019, forward: 0.435, backward: 0.071, update: 2.181\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[proc 1][Train](418/100000) average pos_loss: 0.5988584756851196\n",
      "[proc 1][Train](418/100000) average neg_loss: 0.6242907047271729\n",
      "[proc 1][Train](418/100000) average loss: 0.6115745902061462\n",
      "[proc 1][Train](418/100000) average regularization: 7.324012130993651e-06\n",
      "[proc 1][Train] 1 steps take 2.695 seconds\n",
      "[proc 1]sample: 0.017, forward: 0.431, backward: 0.070, update: 2.178\n",
      "[proc 0][Train](419/100000) average pos_loss: 0.6199166774749756\n",
      "[proc 0][Train](419/100000) average neg_loss: 0.2694475054740906\n",
      "[proc 0][Train](419/100000) average loss: 0.4446820914745331\n",
      "[proc 0][Train](419/100000) average regularization: 7.71273971622577e-06\n",
      "[proc 0][Train] 1 steps take 2.680 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.432, backward: 0.071, update: 2.175\n",
      "[proc 1][Train](419/100000) average pos_loss: 0.6110307574272156\n",
      "[proc 1][Train](419/100000) average neg_loss: 0.2961500883102417\n",
      "[proc 1][Train](419/100000) average loss: 0.45359042286872864\n",
      "[proc 1][Train](419/100000) average regularization: 7.4005524766107555e-06\n",
      "[proc 1][Train] 1 steps take 2.724 seconds\n",
      "[proc 1]sample: 0.001, forward: 0.432, backward: 0.069, update: 2.221\n",
      "^C\n",
      "Process Process-2:1:\n",
      "Process Process-1:1:\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/bin/dglke_train\", line 8, in <module>\n",
      "    sys.exit(main())\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/train.py\", line 281, in main\n",
      "    proc.join()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 149, in join\n",
      "    res = self._popen.wait(timeout)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/popen_fork.py\", line 47, in wait\n",
      "    return self.poll(os.WNOHANG if timeout == 0.0 else 0)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/popen_fork.py\", line 27, in poll\n",
      "    pid, sts = os.waitpid(self.pid, flag)\n",
      "KeyboardInterrupt\n",
      "Traceback (most recent call last):\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n",
      "    self.run()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n",
      "    self.run()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 108, in run\n",
      "    self._target(*self._args, **self._kwargs)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 108, in run\n",
      "    self._target(*self._args, **self._kwargs)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/models/pytorch/tensor_models.py\", line 119, in decorated_function\n",
      "    result, exception, trace = queue.get()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/models/pytorch/tensor_models.py\", line 119, in decorated_function\n",
      "    result, exception, trace = queue.get()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/queues.py\", line 97, in get\n",
      "    res = self._recv_bytes()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/queues.py\", line 97, in get\n",
      "    res = self._recv_bytes()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 216, in recv_bytes\n",
      "    buf = self._recv_bytes(maxlength)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 216, in recv_bytes\n",
      "    buf = self._recv_bytes(maxlength)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 414, in _recv_bytes\n",
      "    buf = self._recv(4)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 414, in _recv_bytes\n",
      "    buf = self._recv(4)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 379, in _recv\n",
      "    chunk = read(handle, remaining)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 379, in _recv\n",
      "    chunk = read(handle, remaining)\n",
      "KeyboardInterrupt\n",
      "KeyboardInterrupt\n",
      "[proc 0][Train](420/100000) average pos_loss: 0.5972341299057007\n",
      "[proc 0][Train](420/100000) average neg_loss: 0.527315616607666\n",
      "[proc 0][Train](420/100000) average loss: 0.5622748732566833\n",
      "[proc 0][Train](420/100000) average regularization: 7.475311576854438e-06\n",
      "[proc 0][Train] 1 steps take 2.680 seconds\n",
      "[proc 0]sample: 0.001, forward: 0.441, backward: 0.070, update: 2.168\n",
      "Process Process-1:\n",
      "Traceback (most recent call last):\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n",
      "    self.run()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/process.py\", line 108, in run\n",
      "    self._target(*self._args, **self._kwargs)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dglke/models/pytorch/tensor_models.py\", line 119, in decorated_function\n",
      "    result, exception, trace = queue.get()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/queues.py\", line 97, in get\n",
      "    res = self._recv_bytes()\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 216, in recv_bytes\n",
      "    buf = self._recv_bytes(maxlength)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 414, in _recv_bytes\n",
      "    buf = self._recv(4)\n",
      "  File \"/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/multiprocessing/connection.py\", line 379, in _recv\n",
      "    chunk = read(handle, remaining)\n",
      "KeyboardInterrupt\n"
     ]
    }
   ],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 1 --eval_interval 50000 --num_thread 32"
   ]
  },
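  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2\n",
    "\n",
    "Bold marks the value used in this run. Compared with run 1, only `lr` changes (0.01 to 0.05).\n",
    "\n",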
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading train triples....\n",
      "Finished. Read 5286834 train triples.\n",
      "Reading valid triples....\n",
      "Finished. Read 293713 valid triples.\n",
      "Reading test triples....\n",
      "Finished. Read 293714 test triples.\n",
      "|Train|: 5286834\n",
      "random partition 5286834 edges into 2 parts\n",
      "part 0 has 2643417 edges\n",
      "part 1 has 2643417 edges\n",
      "/home/luyanfeng/miniconda3/envs/drkg/lib/python3.8/site-packages/dgl/base.py:25: UserWarning: multigraph will be deprecated.DGL will treat all graphs as multigraph in the future.\n",
      "  warnings.warn(msg, warn_type)\n",
      "|valid|: 293713\n",
      "|test|: 293714\n",
      "Total initialize time 16.447 seconds\n"
     ]
    }
   ],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: **200**, 400\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \\\n",
    "--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 10\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: **6**, 12, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 13\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 14\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 15\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, **12**, 18\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 16\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: **0.01**, 0.05, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 17\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, **0.05**, 0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18\n",
    "\n",
    "- batch_size: **4096**\n",
    "\n",
    "- neg_sample_size: **256**\n",
    "\n",
    "- hidden_dim: 200, **400**\n",
    "\n",
    "- gamma: 6, 12, **18**\n",
    "\n",
    "- lr: 0.01, 0.05, **0.1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \\\n",
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \\\n",
    "--model_name TransR \\\n",
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \\\n",
    "--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \\\n",
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \\\n",
    "--valid --test \\\n",
    "--batch_size_eval 128 --neg_sample_size_eval 10000 \\\n",
    "--log_interval 20000 --eval_interval 50000 --num_thread 32"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
